CHAPTER 12

                      ASSEMBLY LANGUAGE PROGRAMMING


This book has consistently presented programming techniques that reduce the
size of your programs, and make them run faster.  Most of the discussions
focused on ways to write efficient BASIC code, and several showed how to
access system interrupt services.  Where speed was critical or BASIC was
inflexible, I presented subroutines written in assembly language.
   Assembly language is the most powerful way to communicate with a PC, and
it offers speed and flexibility unmatched by any other language.  Indeed,
assembly language is in many ways the ultimate programming language because
it lets you control fully every aspect of your PC's operation.  Anything
that a PC is capable of doing can be accomplished using assembly language. 
This final chapter explains assembly language in terms that most BASIC
programmers can understand.
   Why, you might ask, would a BASIC programmer be interested in assembly
language?  After all, the whole point of a high-level language such as
BASIC is to shield the programmer from the underlying hardware.  Without
having to worry about CPU registers and memory addresses, a BASIC
programmer can be immediately productive, and probably write programs with
fewer initial bugs.  However, there are three important reasons for using
assembly language:

   ■ To speed up selected portions of a program

   ■ To reduce the size of a program

   ■ To perform services that BASIC simply cannot

It is important to understand that any high-level language will benefit
from the appropriate use of assembler.  And while it is possible to write
a major application using only assembly language, the increased complexity
and added time to develop and debug it are often not worth the trouble. 
Using a high-level language--especially BASIC--for the majority of a
program and then coding the size- and speed-critical portions in assembly
language often is the most practical solution.
   Many BASIC programmers mistakenly believe that to achieve the fastest
and smallest programs they should learn C.  In my opinion, nothing could
be further from the truth.  Assembly language is barely more difficult to
use than C, and in fact the code is often more readable.  Further, no
high-level language can come even close to what raw 8086 code can achieve. 
If you truly desire to become an advanced programmer, you owe it to
yourself to at least see what assembly language is all about.  I believe
there is no deeper satisfaction than that gained by understanding fully
what your computer is doing at the lowest level.
   This chapter assumes that you already understand basic programming
concepts such as variables, arrays, and subroutines.  As we proceed, most
of the examples will provide parallels to BASIC where possible.  But please
remember one important point: There is nothing inherently difficult about
assembly language.  Attitude is everything, and if you can think of
assembler as a stripped-down version of BASIC, you will be successful that
much sooner.
   For ease of reading, I will refer to the 8088 microprocessor used in the
IBM PC throughout this chapter.  However, everything said about the 8088
also applies to the 8086, the 80286, the 80386/486, and the NEC V series
found in some older PC compatible computers.  I will also use the terms
assembly language and assembler interchangeably, although assembler can
also be used to mean the program that assembles your source files.
   All of the examples in this chapter are meant to be assembled with the
Microsoft Macro Assembler (MASM) version 5.1 or later.  MASM requires that
you save your source files as standard ASCII text, and most word processor
programs can do this.
   Some of the examples in this chapter are derived from those that used
CALL Interrupt in Chapter 11.  In most cases I have not bothered to restate
the same information from that chapter, and you may want to refer back for
additional information.
   Finally, many entire books have been written about assembly language,
and there is no way I can possibly teach you everything you need to know
here.  Rather, my intent is to provide a gentle introduction to the
concepts using practical and useful examples.


AS EASY AS BASIC
================

Assembly language uses the same general form as a BASIC program.  That is,
commands are performed in sequence until a GOTO or GOSUB is encountered. 
In assembly language these are called Jump and Call, respectively.  Many
BASIC instructions have a direct assembler equivalent, although the syntax
is slightly different.  One important difference, however, is that the 8088
microprocessor can operate on integer numbers only.  Another is that for
the greatest efficiency, you are limited to only a few working variables.  I
will begin by showing some rudimentary assembly language instructions, so
you can see how they are analogous to similar commands in BASIC.  Consider
the following BASIC program fragment:

   AX = 5

Here, the value 5 is assigned to the variable AX.  The 8088 has several
built-in variables called *registers*, and one of them is called AX.  To
move the value 5 into the AX register you use the Mov instruction:

   Mov AX,5

As with BASIC, the destination variable in an assembly language program
is always shown on the left, and the source is on the right.  Now consider
addition and subtraction.  To add the value 12 to AX in BASIC you do this:

   AX = AX + 12

The equivalent 8088 command is:

   Add AX,12

Again, the variable or register on the left is always the one that receives
the results of any adding, moving, and so on.  Subtraction is very similar
to addition, replacing Add with Sub:

       BASIC:  AX = AX - 100
   Assembler:  Sub AX,100

Comparing and branching in assembly language is also quite similar to
BASIC.  But instead of this:

   AX = AX + 2
   IF AX > 60 GOTO Finished

You'd do it in assembler this way:

   Add AX,2
   Cmp AX,60
   Ja  Finished

This tells the 8088 to add 2 to AX, then compare AX to 60, and finally to
*jump if above* to the code at label Finished.  There are several kinds of
conditional jump instructions in assembly language, and they often follow
a comparison as shown here.  In fact, all you can really do after a compare
is jump somewhere based on the results.  And while there is no direct
equivalent for this BASIC statement:

   IF AX = 10 THEN BX = BX - 1

You can change the strategy to this:

   IF AX <> 10 GOTO Not10
   BX = BX - 1
   Not10:
    .
    .

Now a direct translation is simple:

   Cmp AX,10
   Jne Not10
   Dec BX
   Not10:
    .
    .

Jne stands for *Jump if Not Equal*.  Also, notice the command Dec, which
means decrement by 1.  This is one case in which an assembler instruction
is actually more to the point than its BASIC counterpart, and is equivalent
to the BASIC command BX = BX - 1.  While Sub BX, 1 would work just as well,
using Dec is faster and generates less code, and we all know that speed is
the name of the game.
   The complement to Dec is Inc, short for *increment by one*.  You can use
Inc and Dec with most of the 8088's registers, as well as on the contents
of any memory location, which brings up an important issue.  At some point,
many programs will require more variables than can be held within the CPU's
registers.  All of the available free memory in a PC can be used as
variable storage, with only a few limitations:

   ■ You must first tell the assembler how much space to set aside, much
   like you would when dimensioning an array. Moreover, MASM is pretty
   friendly and lets you use names for the memory locations.  In fact, in
   most cases you do not need to know the memory addresses variables will
   be stored in--the assembler handles that for you as well!

   ■ Adding, subtracting, incrementing, and decrementing are all much
   faster when done within registers.  When an operation is performed on
   a memory variable, it must first be fetched by the CPU, manipulated, and
   then stored again.  Because the registers are within the CPU chip, those
   extra steps are not needed.  The steps to retrieve and then store memory
   variables are handled transparently by the 8088; I mention this merely
   to explain why register operations are faster.

   ■ Some operations can be done only using registers.  If you want to
   multiply the memory variable Counter by 12, you first have to move the
   variable into AX, do the multiplication, and then move it back into
   memory again.  And if AX is currently holding a needed value, it must
   be saved before multiplying and restored again afterward.  Although
   assembly language is not as complicated as many people think, it surely
   can be tedious at times.

Besides the CPU registers and conventional memory addresses, a special
portion of memory called the *stack* is also available for storage.  The
stack is much like the temporary memory on a four-function calculator, and
it is often used to store intermediate results.  The stack is also commonly
used to pass variables between programs, because all programs can access
it without having to know exactly where in memory it is located.  Again,
assembly language doesn't usually require you to deal with absolute memory
addresses at all--especially for subroutines that will be added to a BASIC
program.  The only exceptions might be when writing directly to the display
screen, or when looking at low memory, perhaps to see whether the Caps Lock
key is engaged.
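   As a rough sketch of how these pieces fit together, the fragment below
multiplies a word variable by 12 while preserving the value AX held
beforehand.  Counter is a hypothetical variable, Push and Pop save a
register on the stack and restore it again, and the sketch assumes that BX
and DX are free to be overwritten:

   Push AX            ;save the value AX is currently holding
   Mov  AX,Counter    ;bring the variable into AX
   Mov  BX,12         ;the multiplier goes in a register
   Mul  BX            ;DX:AX = AX * 12
   Mov  Counter,AX    ;store the low word of the result
   Pop  AX            ;restore the original AX

Mul does not accept a constant operand on the 8088, which is why the value
12 is first placed in BX.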


SPAGHETTI CODE?

To write a routine that converts lower case letters to capital letters in
BASIC, you might use something like this:

   IF AL$ >= "a" AND AL$ <= "z" THEN
     AL$ = CHR$(ASC(AL$) - 32)
   END IF

In assembly language each compare must be done separately, followed by a
jump based on the results.  Let's rephrase the BASIC example slightly:

   IF AL$ < "a" GOTO Done
   IF AL$ > "z" GOTO Done
   AL$ = CHR$(ASC(AL$) - 32)
   Done:
    .
    .

Now a conversion to assembler is easy:

   Cmp AL,"a"     ;compare AL to "a"
   Jb  Done       ;Jump if Below to Done
   Cmp AL,"z"     ;compare AL to "z"
   Ja  Done       ;Jump if Above to Done
   Sub AL,32      ;subtract 32 from AL
   Done:
    .
    .

Notice how the assembler allows the use of quoted constants.  When it sees
a character or string in double or single quotes, it knows you mean to use
the character's ASCII value.  Unlike BASIC with its strong variable typing
that prevents you from performing numeric operations on a string, assembly
language has very few such restrictions.  Also notice how much jumping
around is necessary to accomplish even the simplest of actions.
   As I mentioned earlier, assembly language can certainly be more tedious
than BASIC, although the logic is not really that different.  Such frequent
jumping around is called spaghetti code by some programmers, and the term
is often used in a derogatory fashion when discussing BASIC's GOTO
statement.
But this is the way that computers work, and I am amused by programmers who
argue so strongly against all use of the GOTO command.  While nobody could
seriously object to a well organized and structured programming style, all
programs are eventually converted to equivalent assembly language jumps and
branches.


THE REGISTERS
=============

There are six general purpose registers available for you to use: AX, BX,
CX, DX, SI, and DI.  Each register may be used for the most common
operations like adding and subtracting, although most of them also have a
specialty.  For example, AX is the only register that can be multiplied or
divided.  The A in AX stands for Accumulator, and it is often used for math
operations such as accumulating a running total.  Also, several assembler
instructions result in one byte less code when used with AX, when compared
to the same instructions using other registers.
   The B in BX means Base, and this register is frequently used to hold the
base address of a collection of variables or other data.  If you have a
text string in memory to be examined, you could put the address of the
first character in BX.  The rest of the string can then be found by
referencing BX.
   BX can also be used to specify computed addresses using addition or
subtraction.  For example, the instruction Mov AX,[BX+4] means to load AX
with the word four bytes beyond the address held in BX.  Likewise, the
instruction Add DL,[BX+SI-10] adds the value of the byte at that computed
address to the current contents of DL.  You may use BX this way with either
a constant number, the SI or DI register, or one of those registers and a
constant number.  However, only addition and subtraction may be used, as
opposed to multiplication or division.  I will return to computed and
indirect addressing later in this chapter.
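   To make these forms concrete, here is a small sketch that reads from a
hypothetical four-word table named Table.  It assumes DS already points to
the segment that holds Table, and it uses the Offset operator (described
later in this chapter) to obtain the table's address:

   Table DW 100,200,300,400   ;hypothetical table of four words
    .
    .
   Mov  BX,Offset Table       ;BX holds the address of the first word
   Mov  AX,[BX+4]             ;AX = 300, the word four bytes past BX
   Mov  SI,2                  ;SI holds a byte offset into the table
   Mov  DX,[BX+SI]            ;DX = 200, the word at Table + 2
   Mov  CX,[BX+SI+4]          ;CX = 400, the word at Table + 2 + 4

Because each word occupies two bytes, the offsets used here are always
twice the element number.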
   The C in CX stands for Count, since CX is most often used as the counter
in an assembly language FOR/NEXT loop.  In fact, the assembly language
command Loop uses CX to perform an operation a specified number of times. 
The comparison below illustrates this.

   BASIC:
        FOR CX = 1 TO 5
          GOSUB BeepTone
        NEXT

   Assembler:
        Mov  CX,5
        Do:  Call Beep_Tone
        Loop Do

Here, the Loop instruction decrements CX and branches back to the label Do
until CX reaches zero, so Beep_Tone is called five times.  That is faster
and more efficient than this:

   Mov  CX,5
   Do:  Call Beep_Tone
   Dec  CX
   Cmp  CX,0
   Jne  Do

The DX register is a general purpose Data register, and is named
accordingly.  DX is also used in conjunction with AX when multiplying and
dividing.
   The last two general purpose registers are SI and DI.  SI stands for
Source Index, while DI means Destination Index.  It is not hard to guess
that these registers are well suited for copying data from one memory
location to another.  The 8088 has a rich set of instructions for moving
and comparing strings, using SI and DI to show where they are.
   Like BX, SI and DI may be used with a constant offset such as [SI+100]
to compute a memory address, or with a constant value and/or BX.  But
again, SI and DI are still general purpose registers, and they can be used
for common chores as well.  In many situations it really doesn't matter
whether you use BX or DI or SI or AX.
   There are two specialized registers called BP and SP.  BP (Base Pointer)
is another Base register like BX, only it is intended for use with the
stack.  When you need to access data on the stack, BP is the most
appropriate register to use.  Like BX, BP can reference computed addresses
with a constant offset, with SI or DI, or with a constant and SI or DI.
   The SP (Stack Pointer) register holds the current address of the stack,
and it should never be altered unless you have a very good reason to do so.
   The last four registers are the segment registers, but I will mention
them only briefly right now.  As you undoubtedly know, the 8088 uses a
segmented architecture; although it can utilize a megabyte of memory, it
can do so only in 64K portions at a time.  The CS register holds the
current Code Segment (your program code), DS holds the Data Segment (your
memory variables), SS holds the Stack Segment, and ES is an Extra Segment
that is often used to access arrays located in far memory.
   Each of the 8088 registers can hold one word (two bytes), allowing you
to store any integer number between 0 and 65535.  This range of values can
also be considered as -32768 to 32767.  But AX, BX, CX, and DX may also be
used as two separate one-byte registers with a range of either 0 to 255 or
-128 to 127.  One byte is often sufficient--for example, when manipulating
ASCII characters--and this ability to access each half individually
effectively adds four more registers.  Remember, the more variables you can
keep within registers, the faster and more efficient a program will be.
   When using the registers separately, the two halves are identified by
the letters H and L, for High and Low.  That is, the high portion of AX is
referred to as AH, while the low portion of DX is called DL.  This would
be represented with BASIC variables as follows:

   AX = AL + 256 * AH

Each half can also be represented as bit patterns:

              AX
   ┌──────────────────────┐
    1011  0110  0111  0101
   └──────────┘└──────────┘
        AH          AL

Notice that SI, DI, BP, and SP cannot be split this way, nor can the
segment registers CS, DS, SS, and ES.
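   The short fragment below is one way to see the two halves at work.  It
loads AX with the same bit pattern as the diagram, and the values in the
comments follow directly from that pattern:

   Mov  AX,0B675h     ;the pattern shown above: AH = 0B6h, AL = 75h
   Mov  AL,0          ;only the low half changes: AX = 0B600h
   Inc  AH            ;only the high half changes: AX = 0B700h

Notice the leading zero in 0B675h.  MASM requires a hexadecimal constant to
begin with a digit so it is not mistaken for a name.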
   There is also another register called the Flags register, though it is
not intended for you to use directly.  After performing calculations and
comparisons, certain bits in the Flags register are set or cleared by the
CPU automatically, depending on the results.  For example, if you add a
register that holds the value 40000 to another register whose value is
30000, the Carry flag will be set to show that the result exceeded 64K. 
The 8088 flags are also set or cleared to reflect the result of a Cmp
(Compare) instruction.  Although you will not usually access these flags
directly, they are used internally to process Jne, Ja, and the other
conditional jump commands.
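   As a brief sketch of the addition described above, the following
fragment tests the Carry flag with Jc (Jump if Carry).  The label TooBig is
hypothetical, and the choice of AX and BX is arbitrary:

   Mov  AX,40000
   Mov  BX,30000
   Add  AX,BX         ;the true sum is 70000, which exceeds 65535
   Jc   TooBig        ;Carry is set, so this jump is taken

After the addition AX holds 4464, which is 70000 less 65536.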


VARIABLES IN ASSEMBLY LANGUAGE
============================== 

All of the example routines shown so far have used the 8088 registers as
working variables.  Indeed, using registers whenever possible is always
desirable because they can be accessed very quickly.  But in many real-
world applications, more variables are needed than can fit into the few
available registers.  As with BASIC, MASM lets you define variables using
names you choose, and you must also specify the size of each variable.
   The first step is to define the amount of space that will be set aside
with the assembler instructions DB and DW.  These stand for Define Byte and
Define Word respectively, and they allocate either one byte of storage or
two.  You can also use DD to define a double word long integer variable. 
Notice that these are not commands that the 8088 processor will execute;
rather, they inform the assembler to leave room for the data.  Some
examples are shown below:

   MyByte DB 12h                     ;one byte, preset to 12h
   Buffer DB 15 Dup(0)               ;fifteen bytes, all 0
   Dummy  DW ?                       ;one word (two bytes), no preset value
   Msg    DB "Test message",13,10    ;message, CR, LF

In the first example one byte of memory is allocated using the name MyByte,
and the value 12 Hex is placed there at assembly time.  The second example
illustrates using the Dup (duplicate) command, and tells MASM to set aside
fifteen bytes filling each with the specified value.  In this case that
value is zero.  Initialized data is an important feature of assembly
language, and one that is sorely missing from BASIC.  By being able to
allocate data values at assembly time, additional code to assign those
values at runtime is not needed.
   Filling an area with zeroes can also be accomplished with a question
mark, and this is frequently used when the value that will eventually end
up there is not known in advance.  Both do the same thing in most cases,
however using "?" implies an unknown, as opposed to an explicit zero.  You
may use whichever method seems more appropriate at the time.  The last
example shows how text may be specified, as well as combining values in a
single statement.
   Since the assembler lets you use names for your data, fetching or
storing values can be done with the normal Mov instruction like this:

   Error_Code  DB ?
   Mov Error_Code,AL

This puts the contents of register AL into memory location Error_Code. 
Getting it back again later is just as easy:

   Mov DH,Error_Code

Sometimes the assembler needs a little help when you assign variables. 
When you move AL or DH in and out of a memory location, the assembler knows
that you are dealing with a single byte.  And if you specify BX or SI as
the source or destination operand, the assembler understands this to mean
two bytes, or one word.  But when literal numbers are used, the size of the
value is not always obvious.  Consider the following:

   Mov [BX],3Ch

Does this mean that you want to put the value 3Ch into the byte at the
address held in BX, or the value 003Ch into the *word* at that address? 
There is no way for MASM to know what your intentions are, so you must
specify the size explicitly.  This is done with the Byte Ptr and Word Ptr
directives.  Here, Ptr stands for Pointer, and two examples are shown:

   Mov Byte Ptr [BX],15
   Mov Word Ptr ES:[DI],100

The first example specifies that the memory at address BX is to be treated
as a single byte.  Had Word been used instead, a 15 would be placed into
the byte at the address held in BX, and a zero would be put into the byte
immediately following.  Words are always stored with the low-byte before
the high-byte in memory.
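   The byte order is easy to confirm with a short sketch such as this one,
which assumes only that BX holds the address of two bytes of memory you are
free to overwrite:

   Mov  Word Ptr [BX],1234h  ;stores 34h at [BX] and 12h at [BX+1]
   Mov  AL,[BX]              ;AL = 34h, the low byte
   Mov  AH,[BX+1]            ;AH = 12h, the high byte, so AX = 1234h

No Byte Ptr is needed in the last two lines because AL and AH already tell
the assembler that a single byte is being moved.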
   Memory variables are accessed using the normal complement of
instructions.  For example, to add 15 to the variable Counter you will use
Add Counter,15.  And to multiply AX by the word variable Number you will
use Mul Word Ptr Number.  In MASM versions 5.0 and later, the Word Ptr
argument is not strictly necessary.  That is, if Number had been defined
using DW, then MASM knows that you mean to multiply by a word rather than
a byte.  But earlier versions of the assembler were not so smart, and an
explicit Word Ptr or Byte Ptr was required.
   Note, however, that you must still use Byte Ptr or Word Ptr to override
a variable's type.  For example, if Value was defined as a word but you
want to access just its lower byte, you must use Mov AL,Byte Ptr Value. 
Here, stating Byte Ptr explicitly tells MASM that you are intentionally
treating Value as a different data type.  Otherwise, it will issue a
non-fatal warning message.
   Sometimes you may want to refer to the address of a variable, as opposed
to its contents.  For example, Mov AX,Variable tells MASM to move the value
held in Variable into the AX register.  But many DOS services require that
you specify a variable's address in a register.  This is done using the
Offset operator:  Mov DX,Offset Buffer.  Where Mov DX,Buffer places the
first two bytes of the buffer into DX, using Offset tells MASM that you
instead want the starting address of the buffer.
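   One common case is DOS Interrupt 21h service 9, which prints a string
that ends with a dollar sign and expects the string's address in DS:DX.
The sketch below assumes DS already points to the segment holding the
hypothetical variable Msg:

   Msg  DB "Test message",13,10,"$"   ;text, CR, LF, and terminator
    .
    .
   Mov  DX,Offset Msg     ;DX holds the address of Msg, not its contents
   Mov  AH,9              ;service 9: print string
   Int  21h               ;call DOS

Had Mov DX,Msg been used instead, MASM would complain that a byte variable
does not match the word-sized DX register.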
   You can also use the Lea (Load Effective Address) command to obtain an
address, but that is less frequently used.  Although Lea DX,Buffer can be
used to load DX with the starting address of Buffer, it is a slightly
slower instruction.  Lea is needed only when an address must be computed. 
For example, the instruction Lea SI,[BX+DI] loads SI with the sum of the
BX and DI registers.  You may notice that Lea can provide a shortcut for
adding or subtracting certain register combinations.  Although this use of
Lea is uncommon, Lea can replace the following two instructions:

   Mov SI,BX
   Add SI,DI

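Lea can also subtract a constant value from a register, as in
Lea SI,[BP-10].  However, one register cannot be subtracted from another
this way, because the 8088 addressing modes only add registers together; an
operand such as [BX-DI] is not valid.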


CALCULATIONS IN ASSEMBLY LANGUAGE
=================================

When adding or subtracting you may use two registers, or a register and a
memory variable.  It is not legal to specify two memory variables as in
Add Var1,Var2.
   Multiplying and dividing are not so flexible; only AL and AX may be
multiplied.  When dividing, the dividend must be either in AX, or the long
integer held in DX:AX.  In this case, DX holds the upper word and AX
holds the lower one.  However, you may multiply or divide these registers
using either a register or a memory location.  Because of this restriction,
it is not necessary to specify the target operand size.  That is, Mul CL
means to multiply AL by CL leaving the result in AX, and Div WordVariable
divides DX:AX by the contents of WordVariable leaving the result in AX and
the remainder in DX.  Although you could use the commands Mul AL,CL and Div
AX,WordVariable, this is not necessary or common.
   All of the allowable combinations for multiplying and dividing are shown
in Figure 12-1.


Instruction          Operand    Result    Remainder
════════════════     ═══════    ══════    ═════════
Mul ByteRegister        AL        AX         n/a
Mul ByteVariable        AL        AX         n/a
Mul WordRegister        AX       DX:AX       n/a
Mul WordVariable        AX       DX:AX       n/a

Div ByteRegister        AX        AL          AH
Div ByteVariable        AX        AL          AH
Div WordRegister      DX:AX       AX          DX
Div WordVariable      DX:AX       AX          DX

Figure 12-1: The allowable register/memory combinations for multiplying and
dividing.


In Figure 12-1 ByteRegister means any byte-sized register such as AL or
CH; WordRegister indicates any word-sized register like CX or BP. 
Likewise, ByteVariable and WordVariable specify byte- and word-sized
integer memory variables respectively.
   It's important to understand that you must never divide by zero, because
that will generate a critical error.  Because the result from dividing by
zero is infinity, the 8088 has no way to handle that--it can't simply
ignore the error.  Therefore, dividing by zero causes the CPU to generate
an Interrupt 0.  In a BASIC program that error is routed to BASIC's
internal error handling mechanism which either invokes the ON ERROR handler
if one is in effect, or ends your program with an error message.  In a
purely assembly language program, DOS intervenes, printing an error
message on the screen, and then it ends the program.
   Related to division by zero is dividing when the result cannot fit into
the destination register.  For example, if AX holds the value 20000 and you
divide it by a byte-sized operand holding 2, the resulting 10000 cannot fit
into AL.  Since this is
another unrecoverable error that cannot be ignored, the 8088 generates an
Interrupt 0 there as well.
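   As a rough illustration, the first fragment below triggers that
overflow, and the second avoids it by dividing DX:AX by a word-sized
operand so the result is left in AX.  The register choices are arbitrary:

   Mov  AX,20000
   Mov  BL,2
   Div  BL            ;10000 cannot fit in AL, so Interrupt 0 occurs

   Mov  AX,20000
   Mov  DX,0          ;DX:AX now holds 20000 as an unsigned value
   Mov  BX,2
   Div  BX            ;AX = 10000, and DX = 0 (the remainder)

The same approach cannot rescue a division by zero, so a divisor that might
be zero should be tested with Cmp before dividing.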
   Besides the Div and Mul instructions, there are also signed versions
called Idiv and Imul.  Where Div and Mul treat the contents of AX or DX:AX
as an unsigned value, Idiv and Imul treat them as being signed.  You'll use
whichever command is appropriate, so the 8088 knows if values having their
highest bit set are to be treated as negative.  BASIC always uses Idiv and
Imul in the code it generates, since all integer and long integer values
are treated by BASIC as signed.
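   The difference is easiest to see by dividing the same bit pattern both
ways.  In this sketch Cwd (Convert Word to Doubleword) copies the sign of
AX through all of DX, which is how a signed word is extended to fill DX:AX
before using Idiv:

   Mov  AX,-10        ;AX = 0FFF6h
   Cwd                ;DX:AX now holds -10 as a long integer
   Mov  BX,3
   Idiv BX            ;signed: AX = -3, DX = -1 (the remainder)

   Mov  AX,-10        ;the same bits, 0FFF6h
   Mov  DX,0          ;DX:AX is 65526 when treated as unsigned
   Div  BX            ;unsigned: AX = 21842, DX = 0

BX still holds 3 for the second division because neither Idiv nor Div
alters the divisor.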
   Because only AX and DX:AX may be used for multiplying and dividing, this
affects your choice of registers.  The short example that follows shows how
you might select registers when translating a simple BASIC-like expression
that uses only integer (not long integer) variables.


   BASIC:
        Result = (Var1 + Var2 * (Var3 - Var4)) \ 100


   Assembler:
        Mov  AX,Var3          ;work from the innermost level out
        Sub  AX,Var4          ;so first perform Var3 - Var4
        Imul Word Ptr Var2    ;then multiply that by Var2
        Add  AX,Var1          ;add Var1 to what we have so far
        Cwd                   ;sign-extend AX into DX to prepare DX:AX
        Mov  CX,100           ;use CX for the divisor
        Idiv CX               ;do the division
        Mov  Result,AX        ;then assign Result ignoring the
                              ;  remainder left in DX


Because Idiv with a word-sized operand divides the long integer in DX:AX,
DX must first be made to match the sign of AX.  Cwd (Convert Word to
Doubleword) does this by copying the sign bit of AX into every bit of DX.
Clearing DX with Mov DX,0 is correct only before an unsigned Div.
The use of CX to hold the value 100 is arbitrary.  If CX were currently in
use, any available word-sized register or memory location could be used. 
If you compile this program statement and view the resultant code using
CodeView, you will see that BASIC does an even better job of translating
this particular expression to assembly language.


STRING PROCESSING INSTRUCTIONS
==============================

Besides being able to add, subtract, multiply, and divide, the 8088
provides four very efficient instructions for manipulating strings and
other data in memory.  Movs copies, or moves, a string from one place to
another; Cmps compares two ranges of memory; Stos fills, or stores, one or
more addresses with the same value; and Scas scans a range of memory
looking for a particular value.  These instructions require either a byte
or word specifier.  For example, you would use Movsb to copy a byte, and
Cmpsw to compare two words.
   There are two important factors that contribute to the power and
usefulness of these string instructions: each is only one byte long, and
they automatically increment or decrement the SI and DI registers that
point to the data being manipulated.  Thus, they are both convenient to
use, and also very fast.  Because it is common to access blocks of memory
sequentially a byte or word at a time, automatically advancing SI and DI
saves you from having to do that manually with additional instructions. 
For example, after one pair of words has been compared, SI and DI are
already set to point at the next pair.
   You can also specify that SI and DI are to be decremented by first using
the Std (Set Direction) command.  The Direction Flag stores the current
string operations direction, which is either up or down.  If a previous Std
was in effect, then you'd use Cld (Clear Direction) to force copying and
moving to be forward.  In fact, BASIC *requires* you to clear the direction
flag to forward before returning from a routine that set it to backwards.
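   A minimal sketch of that requirement follows.  It assumes DS:SI and
ES:DI have already been set to point at the source and destination bytes:

   Std                ;SI and DI will now be decremented
   Movsb              ;copy one byte, then SI = SI - 1 and DI = DI - 1
   Cld                ;restore the forward direction before returning

Placing Cld just before the routine returns is a simple habit that keeps
the flag from being left set by accident.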


MOVS AND CMPS

Movs and Cmps use the DS:SI register pair to point to the first range of
memory being copied or compared, and ES:DI to point to the second range. 
Each time a byte is being copied or compared, SI and DI are incremented or
decremented by one to point to the next address.  And when a word is being
accessed, SI and DI are incremented or decremented by two.
   Notice that there is no protection against SI or DI being incremented
or decremented through address zero, nor is there any indication that this
has happened.  Also notice that the name Movs is somewhat of a misnomer. 
To me, moving something implies that it is no longer at its original
location.  Movs does not alter the source data at all--it merely places a
new copy at the specified destination address.


SCAS AND STOS

Scas compares the value in AL or AX with the range of memory pointed to
by ES:DI.  That is, Scasb compares AL and Scasw uses AX.  Stos also uses
ES:DI to show where the data being written to is located; Stosb stores the
contents of AL in the address at ES:[DI] and then increments or decrements
DI by one.  Likewise, Stosw stores the value in AX there and increments or
decrements DI by two.


REPEATING STRING OPERATIONS

If these four instructions merely acted on the data and incremented SI and
DI automatically, that would be very useful indeed.  But they also have
another talent: they recognize a Rep (Repeat) prefix to perform their magic
a specified number of times.  The number of iterations is specified by the
count held in CX.  Furthermore, the number of repetitions can be made
conditional when comparing and scanning, based on the data encountered.
   If you have, say, 20 bytes of data that need to be copied from one place
to another, you would first set CX to 20 and then use Rep Movsb.  And to
compare 100 words you would load CX with the value 100 and use Rep Cmpsw. 
Stos also accepts a Rep prefix; Rep Stosb places the value in AL into CX
bytes of contiguous memory starting at the address specified in ES:DI.  For
each iteration the 8088 decrements CX, and when it reaches zero the copying
or comparing is complete.
   It is usually not valuable to scan a range of memory unconditionally and
repeatedly.  Therefore Scas is generally used in conjunction with either
Repe (Repeat while Equal) or Repne (Repeat while Not Equal).  Cmps is also
generally used with these conditional prefixes, to avoid wasting time
comparing bytes after a match or a difference was found.  In either case,
however, you load CX with the total number of bytes or words being compared
or scanned.
   Because each iteration decrements CX, you can easily calculate how many
bytes or words were actually processed.  Also, you can test the results of
scanning and comparing using the normal methods such as Je and Jne.  The
following few examples show some ways these commands can be used.

   See if two 40-byte ranges of memory are the same:

        Mov  CX,20              ;comparing 20 words is faster than 40 bytes
        Repe Cmpsw              ;compare them a word at a time
        Je   Match              ;they matched

   Copy a 2000-element integer array to color screen memory:

        Mov  AX,ArraySeg        ;set DS to the source segment
        Mov  DS,AX              ;through AX
        Mov  SI,ArrayAdr        ;point SI to the array start
        Mov  AX,0B800h          ;the color text screen segment
        Mov  ES,AX              ;assign that to ES
        Mov  DI,0               ;clear DI to point to address 0
        Mov  CX,2000            ;prepare to copy 2000 words
        Rep  Movsw              ;copy the data

   Search a DOS string looking for a terminating zero byte:

        Mov  AX,StringSeg       ;set ES to the string's segment
        Mov  ES,AX              ;(ES cannot be assigned directly)
        Mov  DI,Offset ZString  ;point DI to the string data
        Mov  CX,80              ;search up to 80 bytes
        Mov  AL,0               ;looking for a zero value
        Repne Scasb             ;while ES:[DI] <> AL
        ;-- Now DI points just past the terminating zero byte.
        ;-- The length of the string is (80 - CX - 1), because the
        ;-- zero byte itself was also counted.

In the first example, it is assumed that DS:SI and ES:DI already point to
the correct segment and address.  By asking to compare only while the bytes
are equal, the result of the most recent byte comparison can be tested
using Je.  A common mistake many programmers make is comparing the bytes,
and then checking if CX is zero.  The reasoning is that if CX is zero then
they must have all matched; otherwise, the 8088 would have aborted the
comparisons early.  But CX will also be zero if all but the last byte
matched!  Therefore, you must check the zero flag using Je (or Jne if that
is more appropriate).
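   This trap is easy to demonstrate with a small model.  The Python sketch
below is only an illustration of how Repe Cmpsb behaves, and is not 8088
code; the name repe_cmpsb and its return values are my own invention.  It
returns the final count in CX along with the state of the zero flag.

```python
def repe_cmpsb(src, dst, cx):
    """Model Repe Cmpsb: compare up to cx bytes, stopping at a mismatch.

    Returns (cx, zf), where zf is the zero flag left by the last comparison.
    """
    si = di = 0
    # Assumption: the flag starts out set.  With CX = 0 the real CPU
    # performs no comparison and leaves the flags as they were.
    zf = True
    while cx != 0:
        zf = (src[si] == dst[di])   # Cmpsb sets the flags from this compare
        si += 1
        di += 1
        cx -= 1                     # CX is decremented on every iteration
        if not zf:                  # Repe stops as soon as the bytes differ
            break
    return cx, zf

same = repe_cmpsb(b"ABCD", b"ABCD", 4)          # every byte matches
last_differs = repe_cmpsb(b"ABCD", b"ABCX", 4)  # only the final byte differs
```

In both calls CX ends at zero, so only the zero flag (which is what Je and
Jne test) tells a complete match from a mismatch in the final byte.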
   Notice in the first example how 20 words are compared, rather than 40
bytes.  Although the net result is the same, word operations are faster on
80286 and later processors when the blocks of memory begin at an even
numbered address.  [Though you can't always know if a variable or block
of memory will begin at an even address, using the word version will be
more efficient at least some of the time.]
   The second and third examples include the code needed to set up the
appropriate segment and address values in DS:SI and ES:DI.  Although this
may seem like a lot of work, you can often do this setup only once and then
use the same registers repeatedly within a routine.  Unfortunately, you
are not allowed to assign a segment register from a constant number.  You
must first assign the number to a conventional register, and then use Mov
to copy it to the segment register.
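
The arithmetic behind the string search in the third example can be checked
with another small model.  This Python sketch assumes the direction flag is
clear so DI counts upward, and the function name and the sample buffer are
hypothetical.  It mimics Repne Scasb and then derives the string length from
what is left in CX.

```python
def repne_scasb(memory, di, cx, al):
    """Model Repne Scasb: scan up to cx bytes at memory[di:] for al.

    Returns (di, cx, zf) as they stand when the scan stops.
    """
    zf = False
    while cx != 0:
        zf = (memory[di] == al)  # Scasb compares AL with the byte at ES:[DI]
        di += 1                  # DI always advances, even on a match
        cx -= 1                  # and CX is always decremented
        if zf:                   # Repne stops once a match is found
            break
    return di, cx, zf

buffer = b"C:\\DOS\x00" + bytes(73)   # a made-up 80-byte buffer
di, cx, found = repne_scasb(buffer, 0, 80, 0)
length = 80 - cx - 1                  # bytes scanned, less the zero itself
```

Because the zero byte is counted as one of the bytes scanned, the length of
the text is one less than the number of bytes that were examined.  If the
zero flag is not set afterward, no terminator was found within the limit.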


THE STACK

The primary purpose of the stack is to retain the return address of a
program when a subroutine is called.  This is true not only for assembly
language, but for BASIC as well.  For example, when you use the BASIC
statement GOSUB 1200, BASIC must remember the location in memory of the
next command to execute when the routine returns.  It does this by placing
the address of the next instruction onto the stack *before* it jumps to the
subroutine.  Then when a RETURN instruction is encountered, the address to
return to is available.  The 8088 understands Calls and Returns directly,
and it places and restores the addresses on the stack automatically.
   The stack is not unlike a stack of books on a table, and one of its
great advantages is that you don't need to know where in memory it is
actually located.  Items can be placed onto the stack either manually with
the Push instruction, or automatically by the 8088 processor as part of its
handling of Call and Return statements.  Values are retrieved from the
stack with the Pop command, among other methods.
   One important feature of the stack is that when items are added and
removed, the stack pointer register is updated automatically to reflect the
next available stack location.  Thus, a program can access items on the
stack based on the stack pointer, rather than having to know the exact
address at any given time.  This simplifies exchanging information between
programs, since neither has to know how the other operates.  This mechanism
also makes it possible for programs written in one language to communicate
with subroutines written in another.
   Figure 12-2 shows how the stack operates.


│          │
│          │
│          │
│          │
├──────────┤
│  Item 1  │ <── first item that was pushed
├──────────┤
│  Item 2  │ <── second item that was pushed
├──────────┤
│  Item 3  │ <── third item that was pushed
├──────────┤
│  Item 4  │ <── last item that was pushed (SP points here)
├──────────┤
│   Next   │ <── next available stack location
├──────────┤
│          │ ┌── the stack grows downward
│          │ │   as new items are added
│          │ │
│          │ │
             \/

Figure 12-2: The organization of the CPU stack.


As each item is pushed onto the stack, it is placed two bytes below the
address held in the stack pointer.  Then the stack pointer is decremented
by two, to show the next available stack location.  Therefore, the stack
grows downward as new items are added.  Note that only full words may be
pushed onto the stack, so all of the items shown here are two bytes in
size.  Also note that the stack pointer holds the address of the last item
that was pushed.
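
To make the bookkeeping concrete, here is a toy model of the stack written
in Python.  It is a sketch under simplifying assumptions: the class name,
the 64-byte size, and the starting value of SP are all invented, and only
the net effect of Push and Pop is modeled.

```python
class Stack:
    """A toy model of the 8088 stack: 16-bit words in a flat byte array."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.sp = size           # assumption: an empty stack starts at the top

    def push(self, word):
        self.sp -= 2                                # SP moves down two bytes
        self.mem[self.sp] = word & 0xFF             # low byte is stored first
        self.mem[self.sp + 1] = (word >> 8) & 0xFF  # then the high byte

    def pop(self):
        word = self.mem[self.sp] | (self.mem[self.sp + 1] << 8)
        self.sp += 2                                # popping moves SP back up
        return word

stack = Stack()
stack.push(0x1234)
stack.push(0xABCD)
sp_after_pushes = stack.sp       # two pushes lower SP by four bytes
last = stack.pop()               # the most recent item comes off first
first = stack.pop()
```

After both values are popped, SP is back where it started, which is exactly
what "keeping the stack balanced" means.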


PASSING PARAMETERS
==================

Imagine you have a BASIC subroutine that does something to the variable X. 
The code to assign X, process, and print X might look like this:

   X = 12
   GOSUB 2000     'the routine at line 2000 manipulates X
   PRINT X

In assembly language you could push the value 12 onto the stack, and then
call the subroutine.  The subroutine, expecting the value there, would
retrieve it, do its work, and then place the result back again before
returning.  This is similar, but not identical, to how variables are passed
between programs.  Most high-level languages including BASIC pass variables
to subroutines by placing their *addresses* on the stack.  A called routine
can then access the variable via its address, either to read it or to
assign a new value.
   If BASIC let you access the registers directly, it could pass variables
through them, as you saw when telling DOS which of its services to do.  But
BASIC doesn't allow that and moreover, with a limited number of registers,
only a few variables or addresses could be accommodated.  The stack can
hold any number of arguments, by pushing the address of each in turn.
   When you use the BASIC CALL command and pass a variable name to a SUB
or FUNCTION procedure, BASIC first pushes the address of that variable onto
the stack, before jumping to the code being called.  And if more than one
variable is specified, all of the addresses are pushed.  The example below
shows how you might call a routine that returns the current default drive.

   CALL GetDrive(Drive%)

When GetDrive begins, it knows that the stack is holding the address of
Drive%.  The segment and address of the calling BASIC program are also on
the stack; however, GetDrive is not concerned with that.  The important
point is that it can find the address on the stack using the SP (Stack
Pointer) register.  When GetDrive begins the stack is set up as shown in
Figure 12-3.


│          │ ^
│          │ │
│          │ │
│          │ └── higher addresses
├──────────┤
│  Drive%  │ <── the address of Drive% that BASIC pushed
├──────────┤
│ Ret Seg  │ <── BASIC's segment to return to
├──────────┤
│ Ret Adr  │ <── BASIC's address to return to (SP holds this address)
├──────────┤
│   Next   │ <── the next available stack location
├──────────┤
│          │
│          │
│          │
│          │

Figure 12-3: The state of the stack within a procedure when one variable
address was passed.


Notice that while GetDrive can get at the address of Drive% through SP,
an extra step is still required to get at the *data* held in Drive%.  Let's
digress for a moment to reconsider the difference between memory addresses
and values.  The assembler command Mov AX,12 puts the value 12 into
register AX.  But suppose you want to put the contents of *memory location*
12 into AX.  You indicate this to the assembler by using brackets, as shown
in the two equivalent examples following.

   Mov AX,[12]    ;load AX from address 12

   Mov BX,12      ;assign BX to the value 12
   Mov AX,[BX]    ;load AX from the address held in BX

The first statement loads AX from the contents of memory at address 12. 
The second first loads BX with the number 12, and then uses BX to identify
that address, moving the contents of that address into AX.  This is an
important distinction, and is illustrated in Figure 12-4 using parallels
to BASIC's PEEK and POKE commands.


     BASIC                      Assembler
════════════════════       ═════════════════════
BP = SP                    Mov BP,SP
AL = PEEK(BP + 8)          Mov AL,[BP+8]
SI = 12                    Mov SI,12
POKE SI, 12                Mov Byte Ptr [SI],12

Figure 12-4: Similarities between BASIC's PEEK and POKE, and the assembly
language Mov instruction.


Although you can easily find the address of Drive% by looking at SP, an
extra step is required to get at the actual value.  The example that
follows shows how to do this, except there is one added complication.  You
are not allowed to use SP for addressing, except with 386 and later
microprocessors.  Since you undoubtedly want your programs to work with as
many computers as possible, a different strategy must be used.
   As I mentioned earlier, the BP register is a base register that is meant
for accessing data on the stack.  Therefore, you must first copy SP into
BP, and then use BP to access the stack.  Then you can find where Drive%
is located, and put the current drive number into that address as shown
following:

   Mov  BP,SP      ;put the current stack pointer into BP
   Mov  SI,[BP+4]  ;put the address of Drive% into SI
   Mov  AH,19h     ;tell DOS we want the default drive
   Int  21h        ;call DOS to do it
   Mov  [SI],AL    ;put the answer into Drive%

Notice how brackets are used to indicate the addresses.  You must first
determine the address of Drive%'s address (whew!), before you can put the
value held in AL there.  This is called indirect addressing, because a
register is used to hold the address of the data.  Again, notice how the
8088 accepts addition on the fly when you tell it BP+4.
   The complete working GetDrive routine has two small added complications.
Besides SP being unusable for addressing memory, BASIC also requires that
BP be unchanged when your routine returns.  The obvious solution, therefore,
is to first save BP on the stack before changing it, and then restore BP
later before returning to BASIC.  The other complication is caused by the
very fact that BASIC put extra information (Drive%'s address) onto the
stack.  But neither is insurmountable, as shown here:

   Push BP          ;save BP before changing it
   Mov  BP,SP       ;put the stack pointer into BP
   Mov  SI,[BP+6]   ;put the address of Drive% into SI
   Mov  AH,19h      ;tell DOS we want default drive
   Int  21h         ;call DOS to do it
   Mov  [SI],AL     ;put the answer into Drive%
   Pop  BP          ;restore BP to its original value
   Ret  2           ;return to BASIC

Notice that here, the address of Drive% is at [BP+6] rather than [BP+4]
as it was in the previous listing.  Since BP was pushed at the start of the
procedure, the stack pointer is two bytes lower when it is subsequently
assigned to BP.  When SI is loaded, [BP] points to the saved version of
itself, [BP+2] and [BP+4] point to the address and segment to return to,
and [BP+6] holds the address of Drive%'s address.  This is illustrated in
Figure 12-5.


│          │
│          │
│          │
│          │
├──────────┤
│  Drive%  │ <── [BP+6] points here
├──────────┤
│ Ret Seg  │ <── [BP+4] points here
├──────────┤
│ Ret Adr  │ <── [BP+2] points here
├──────────┤
│ Saved BP │ <── [BP] points here
├──────────┤
│   Next   │ <── the next available stack location
├──────────┤
│          │
│          │
│          │
│          │

Figure 12-5: The state of the stack within a procedure after BP has been
pushed.
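
The offsets from BP can be verified by replaying the pushes in order.  The
Python sketch that follows is a model only; the function name and every
address and value in it are invented for illustration.

```python
def build_frame(sp, drive_addr, ret_seg, ret_adr, old_bp):
    """Return (bp, stack) after a far call with one parameter and Push BP.

    stack maps each stack address to the word stored there.
    """
    stack = {}
    # Push order: BASIC pushes the parameter address, the far Call pushes
    # the return segment and then the return address, and finally the
    # routine itself pushes BP.
    for word in (drive_addr, ret_seg, ret_adr, old_bp):
        sp -= 2                  # each push lowers SP by two
        stack[sp] = word
    bp = sp                      # Mov BP,SP
    return bp, stack

# All of these values are made up for illustration.
bp, stack = build_frame(sp=0x1000, drive_addr=0x0ABC,
                        ret_seg=0x2000, ret_adr=0x0123, old_bp=0x0FF0)
```

Reading stack[bp + 6] returns the address that was pushed first, which is
why the routine loads SI from [BP+6] once BP has been saved.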


Normally when a Ret command is encountered, the 8088 pops the last four
bytes from the stack automatically, and returns to the segment and address
contained in those bytes.  But that would leave the 2-byte address of
Drive% still cluttering up the stack.  To avoid this problem the 8088 lets
you specify a *parameter count* as part of the Ret instruction.
   For each variable address that is passed with a CALL from BASIC, you
must add 2 to the Return instruction in your assembler routine.  This is
the number of bytes to remove from the stack, with two being used for each
incoming two-byte address.  Had two variables been passed, the program
would have used Ret 4 instead.  Although it is possible to have the calling
program clean up the stack itself, that would be wasteful.
   For every occurrence of every call that passes parameters, BASIC would
have to include additional code following the call to increment SP
accordingly.  Pushing a parameter's address onto the stack leaves that much
less stack space available.  Therefore, someone has to reverse the process
and either pop the addresses or use Add SP,Num to adjust the stack pointer. 
By having the called routine handle it, that code is needed only once.  In
fact, this is an important deficiency of C, because by design C requires
the caller to clean up the stack.
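   The effect of the number that follows Ret is simple to tally.  This
Python sketch follows SP through a single call made the way BASIC makes it;
the function name is hypothetical and only the stack pointer is modeled.

```python
def sp_after_call(sp, num_params, ret_operand):
    """Track SP through one far call that passes num_params addresses."""
    sp -= 2 * num_params    # BASIC pushes one 2-byte address per parameter
    sp -= 4                 # the far Call pushes a segment and an address
    sp += 4                 # Ret pops the 4-byte return address...
    sp += ret_operand       # ...and Ret n then discards n more bytes
    return sp

balanced = sp_after_call(0x1000, 1, 2)   # Ret 2 with one parameter
leftover = sp_after_call(0x1000, 1, 0)   # a plain Ret strands two bytes
```

With Ret 2 the stack pointer returns to its starting value.  With a plain
Ret it ends two bytes lower, and those bytes would be lost on every call.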
   [If you've managed to persevere this far you'll be pleased to know that
in practice, the assembler can be told to handle most or all aspects of
stack addressing for you.  This is discussed in the sections that follow.]
   It is also possible to tell BASIC to pass some types of parameters by
value using the BYVAL option in the DECLARE or CALL statements.  When BYVAL
is used, BASIC places the actual value of the variable onto the stack,
rather than its address.  This has several important benefits.  First, the
assembly language routine can use one less instruction.  Second, when a
constant number is passed, BASIC does not need to make a copy of it in
DGROUP.  This copying was described in Chapter 2.
   However, BYVAL is appropriate only when a parameter does not have to be
returned, and only when the values are integers.  If you pass a double
precision parameter using BYVAL, all eight bytes are placed on the stack
using four separate instructions rather than only two needed to pass the
address.  You can also instruct BASIC to pass the full, segmented address
of a parameter, and that is discussed in the section "Dynamic Arrays."


PROCEDURES IN ASSEMBLY LANGUAGE
===============================

All of the discussions so far have focused on how to write the instructions
for an assembly language subroutine.  However, none have described how
these routines are added to a BASIC program, or how a complete procedure
is defined.  Furthermore, the previous examples have not shown a key step
that is needed with all such external routines: establishing the code and
data segments.
   Before an external routine can be linked to a BASIC program you must
establish a public procedure name that LINK can identify.  I will first
show the formal method for defining a procedure and its segments, and then
show the newer, simplified methods that were introduced with MASM version
5.0.  The simplified syntax is used for all of the remaining examples in
this chapter [so don't worry if the setup details for this first example
appear overwhelming].
   The simplest complete subprogram you are likely to encounter is probably
the PrtSc routine that follows--all it does is call Interrupt 5 to send the
contents of the current display screen to LPT1.


Code    Segment Word Public 'Code'
Assume  CS:Code
Public  PrtSc
PrtSc   Proc Far       ;this is equivalent to SUB PrtSc STATIC in BASIC

Int  5                 ;call BIOS interrupt 5
Ret                    ;return to BASIC

PrtSc   Endp           ;this is equivalent to BASIC's END SUB
Code    Ends
End


The first three lines tell the assembler that the code is to be placed in
the segment named Code, and that the name PrtSc is to be made public.  The
fourth line defines the start of a procedure.  The actual code occupies
the next two lines.  Of course, you must tell the assembler where the
procedure ends, which in this case is also the end of the code segment. 
Had several procedures been included within the same block of code, each
procedure would show a start and end point, but there would only be a
single code segment.  The final End statement is needed to tell the
assembler that this is the end of listing, although you might think that
MASM would be smart enough to figure that out by itself!
   Notice that there are two kinds of procedures: Far and Near. External
routines that are called from BASIC are always Far, because BASIC uses what
is called a *medium model*.  This means the procedure does not necessarily
have to be within the same code segment as the main BASIC program.  The
medium model allows the combined programs to exceed the usual 64k limit
when linked to a final .EXE file.
   When BASIC executes a CALL command, it uses a two-word address as the
location to jump to.  One of the words contains a segment, and the other
an address within that segment.  Then when your program finally returns,
the 8088 must know to remove two words from the stack--a segment and an
address--to find where to return to in the calling BASIC program.
   A near procedure, on the other hand, calls an address that is only one
word long.  And when the procedure returns, only a single word is popped
from the stack.  Again, the assembler does the bulk of the dirty work for
you.  You just have to remember to use the word Far.


SIMPLIFIED DIRECTIVES

Fortunately, Microsoft realized what a pain dealing with segments and
procedures and offsets from BP can be, and they enhanced MASM beginning
with version 5.0 to handle these details automatically for you.  Rather
than require the programmer to define the various code and data segments,
all that is needed are a few simple key words.
   The first is .Model Medium, which tells MASM that the procedures that
follow will be Far.  Used in conjunction with .Code and .Data, .Model
Medium tells MASM that any data you define should be placed into a group
named DGROUP.  Adding ,Basic after the .Model directive also declares your
procedures as Public automatically, so BASIC can access them when your
program is linked.
   By using the name DGROUP, the linker automatically gathers all of your
DB and DW data variables, and places them into the same segment that BASIC
uses.  While this has the disadvantage of impinging on BASIC's near data
space, it also means that on entry to the routine the DS register (which
BASIC sets to hold the DGROUP segment) holds the correct segment value for
your variables as well.
   To show the advantages of simplified directives, contrast the earlier
PrtSc with this version that does exactly the same thing:


.Model Medium, Basic
.Code

PrtSc Proc
  Int 5
  Ret
PrtSc Endp
End


MASM 5.1 introduced additional simplified directives that let you access
incoming parameters by name, rather than as offsets from BP.  All of the
remaining examples in this chapter take advantage of simplified directives,
as the following revised listing for GetDrive illustrates.


;Syntax: CALL GetDrive(Drive%)

.Model Medium, Basic
.Data
   ;-- if variables were needed they would be placed here

.Code
GetDrive Proc, Drive:Word

  Mov  AH,19h      ;tell DOS we want the default drive
  Int  21h         ;call DOS to do it
  Mov  BX,Drive    ;put the address of Drive% into BX
  Cbw              ;clear AH to make a full word
  Mov  [BX],AX     ;then store the answer into Drive%
  Ret              ;return to BASIC

GetDrive Endp      ;indicate the end of the procedure
End                ;and the end of the source file


As you can see, this looks remarkably like a BASIC SUB or FUNCTION
procedure, with the incoming parameter listed by name and type as part of
the procedure declaration.  This greatly simplifies maintaining the code,
especially if you add or remove parameters during development.  If incoming
parameters are defined as shown here using Drive, code to push BP and then
move SP into BP is added for you automatically.  When you refer to one of
the parameters, the assembler substitutes [BP+##] in the code it generates. 
Note, however, that the Word identifier for Drive refers to the 2-byte size
of its address, and not the fact that Drive% is a 2-byte integer.
   Also notice the new Cbw command, which is used here to clear the AH
register.  Cbw (Convert Byte to Word) expands the byte value held in AL to
a full word in AX.  A full word is needed to ensure that both the high- and
low-byte portions of Drive% are assigned, in case it held a previous value. 
If the value in AL is positive (between 0 and 127), AH is simply cleared
to zero.  And if AL is negative (between -128 and -1 or between 128 and
255), Cbw instead sets all of the bits in AH to be on.  Thus, the sign of
the original number in AL is preserved.
   A complementary statement, Cwd (Convert Word to Double Word), converts
the word in AX to a double-word in DX:AX.  Again, if AX is positive when
considered as a signed number, DX is cleared to zero.  And if AX is
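currently negative, DX is set to FFFFh (-1) to preserve the sign.  Cbw and
Cwd are both one-byte instructions, so even with unsigned values they are
smaller and faster for clearing AH or DX than Mov AH,0 and Mov DX,0, which
require two bytes and three bytes respectively.  This shortcut is safe only
when the high bit of AL or AX is known to be clear, as it is here because
DOS returns a drive number no larger than 25.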
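   Sign extension is easier to see with real numbers.  The two Python
functions below are a sketch of what Cbw and Cwd compute; the function
names simply mirror the instructions, and register values are treated as
unsigned integers.

```python
def cbw(al):
    """Sign-extend the byte in AL to a word in AX (returned as 0..65535)."""
    ah = 0xFF if al & 0x80 else 0x00   # copy bit 7 of AL into all of AH
    return (ah << 8) | al

def cwd(ax):
    """Sign-extend the word in AX to DX:AX; returns (DX, AX)."""
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax

drive_c = cbw(2)        # a drive number of 2 becomes the word 0002h
negative = cbw(0xFF)    # -1 as a byte becomes -1 as a word, FFFFh
```

Any value of 128 or more in AL sets AH to FFh rather than clearing it.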
   Finally, the Ret command that exits the procedure is translated by MASM
to include the correct stack adjustment value, based on the number of
incoming parameters.  If you have multiple exit points from the procedure
(equivalent to EXIT SUB), the exit code will be generated multiple times. 
That is, each occurrence of Ret is replaced with a code sequence to pop the
saved registers, and perform the 3-byte Ret # instruction.  Therefore, you
should always use a single exit point in a routine, and jump to that when
you need to exit from more than one place.


CALLING INTERRUPTS
==================

Chapter 11 explained how interrupts work, and mentioned that only assembly
language can call an interrupt directly.  An assembler program uses the Int
instruction, and this tells the 8088 to look in the interrupt vector table
in low memory to obtain the interrupt procedure's segment and address. 
Then the procedure is called as if it were a conventional subroutine.
   All of the DOS and BIOS services are accessed using interrupts, though
there are so many different services that you also have to pass a service
number to many of them.  Most of the DOS services are accessed through
interrupt 21h.  Where BASIC uses the &H prefix to indicate a hexadecimal
value, assembly language uses a trailing letter H.  If you specify a number
without an H it is assumed by MASM to be regular decimal.  Note that MASM
doesn't care if you use upper- or lowercase letters, and knows that either
means hexadecimal.
   When specifying hexadecimal values to MASM, the first character must
always be a digit.  That is, 1234h is acceptable, but &HB800 must be
entered as 0B800h.  Using B800h will generate a syntax error.
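
The conversion rule between the two notations is mechanical, as this short
Python sketch shows.  The helper name basic_hex_to_masm is hypothetical and
is not part of BASIC, MASM, or any other tool.

```python
def basic_hex_to_masm(text):
    """Convert a BASIC hex constant such as '&HB800' to MASM form."""
    if text[:2].upper() != "&H":
        raise ValueError("expected a BASIC constant beginning with &H")
    digits = text[2:].upper()
    if not digits or any(ch not in "0123456789ABCDEF" for ch in digits):
        raise ValueError("not a hexadecimal number")
    if digits[0] not in "0123456789":   # MASM needs a leading decimal digit
        digits = "0" + digits
    return digits + "h"

screen_segment = basic_hex_to_masm("&HB800")
dos_interrupt = basic_hex_to_masm("&H21")
```

Only constants that begin with a letter A through F gain the extra zero.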


DOS AND BIOS SERVICES

You have already seen how to call the BIOS routine that prints the screen
and the DOS routine that returns the current drive.  Let's continue and see
how to call some of the other useful routines in the BIOS and DOS.
   The next example program, DosVer, shows how to call the DOS service that
returns the DOS version number.  Like many of the assembler routines that
you can use with BASIC, DosVer relies on an existing DOS service to do the
real work.  In this program you will also learn how to push and pop values
on the stack.
   The syntax for DosVer is CALL DosVer(Version%), where Version% returns
with the DOS version number times 100.  That is, if your PC is running DOS
version 3.30, then Version% will be assigned the value 330.  Manipulating
floating point numbers is much more difficult than integers, and the added
complexity is not justified for this routine.
   The DOS service that retrieves the version number returns with two
separate values--the major version number (3 in this case) and the minor
number (30).  These values are returned in AL and AH respectively.  The
strategy here is to first multiply AL by 100, and then add AH.  The last
step is to assign the result to the incoming parameter Version%.
   Unfortunately, when you use AL for multiplication, the value 100 must
be in a register or memory location.  You can't just use MUL AL,100 though
it would sure be nice if you could.  Further, whenever AL is multiplied the
result is placed into the entire AX register.  Therefore, DosVer also uses
BX to temporarily store the original contents of AX before the two are
added together.
   As you already have learned, the only register that can be multiplied
is AX, or its low-byte portion, AL.  MASM knows if you plan to multiply AX
or AL based on the size of the argument.  For example, Mul BX means to
multiply AX by BX and leave the result in DX:AX.  Mul CL instead multiplies
AL by CL and leaves the answer in AX.
   The complete DosVer routine is shown following, and comments explain
each step.


;DOSVER.ASM, retrieves the DOS version number

.Model Medium, Basic
.Code

DOSVer Proc, Version:Word

  Mov  AH,30h      ;service 30h gets the version
  Int  21h         ;call DOS to do it

  Push AX          ;save a copy of the version for later
  Mov  CL,100      ;prepare to multiply AL by 100
  Mul  CL          ;AX is now 300 if running DOS 3.xx

  Pop  BX          ;retrieve the version, but in BX
  Mov  BL,BH       ;put the minor part into BL for adding
  Mov  BH,0        ;clear BH, we don't want it anymore
  Add  AX,BX       ;add the major and minor portions

  Mov  BX,Version  ;get the address for Version%
  Mov  [BX],AX     ;assign Version% from AX
  Ret              ;return to BASIC

DOSVer Endp
End


Notice the extra switch that is done with BH and BL.  AX is saved onto the
stack because multiplying the byte in AL leaves the result as a full word
in AX, thus destroying AH.  When the version is popped into BX, the minor
part is in BH.  But you are not allowed to add registers that are different
sizes (AX and BH).  Further, any number in the high half of a register is
by definition 256 times the value of the same number in a low half. 
Therefore, BH is first copied to BL to reflect its true value.  BH is then
cleared so it won't affect the result, and finally AX and BX are added.
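
The same arithmetic can be followed step by step in a high-level language.
This Python sketch mirrors what DosVer does to the value DOS returns in AX;
the function name and the sample values are mine.

```python
def dos_version(ax):
    """Mirror DosVer's arithmetic on the AX value that service 30h returns."""
    al = ax & 0xFF            # major version number, e.g. 3
    ah = (ax >> 8) & 0xFF     # minor version number, e.g. 30
    product = al * 100        # Mul CL leaves the full-word product in AX
    bx = ah                   # Mov BL,BH and Mov BH,0 keep only the minor part
    return (product + bx) & 0xFFFF   # Add AX,BX

version = dos_version(0x1E03)   # AH = 30 (1Eh) and AL = 3, as for DOS 3.30
```

Had the minor part been left in the high half, it would have counted as
30 * 256 rather than 30 when the two registers were added.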
   A better way to save AX and then restore it to BX would be to simply use
Mov BX,AX immediately after the call to Interrupt 21h.  I used Push and Pop
just to show how this is done.  As you can see, it is not necessary to pop
the same register that was pushed.  However, every Push instruction must
always have a corresponding Pop, to keep the stack balanced.  If a register
or other value is on the stack when the final Ret is encountered, that
value will be used as the return address which is of course incorrect.
   Division also acts on AX, or the combination of DX:AX.  When you use
the command Div BL, the 8088 knows you want to divide AX because BL is a
byte-sized argument.  It then leaves the result in AL and the remainder,
if any, is placed into AH.  Similarly, Div CX means that you are dividing
the long integer in DX:AX, because CX is a word.  The result of this
division is assigned to AX, with the remainder in DX.
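
Both forms of Div are sketched in Python below.  The function names are
invented, and the exceptions stand in for the divide error (Interrupt 0)
that the processor generates when the divisor is zero or when the result
is too large for the destination register.

```python
def div_byte(ax, divisor):
    """Model Div with a byte operand: returns (AL quotient, AH remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    quotient, remainder = divmod(ax, divisor)
    if quotient > 0xFF:
        raise OverflowError("quotient does not fit in AL")
    return quotient, remainder

def div_word(dx, ax, divisor):
    """Model Div with a word operand: returns (AX quotient, DX remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    quotient, remainder = divmod((dx << 16) | ax, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("quotient does not fit in AX")
    return quotient, remainder

major, minor = div_byte(330, 100)   # the reverse of what DosVer computes
```

Dividing 330 by 100 this way recovers the major version in AL and the minor
version in AH.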


ACCESSING BASIC STRINGS IN ASSEMBLY LANGUAGE
============================================

As Chapter 2 explained, strings are stored very differently than regular
numeric variables.  BASIC lets you find the address of any variable with
the VARPTR function.  For integer or floating point numbers, the value
VARPTR returns is the address of the actual data.  But for strings, VARPTR
instead returns the address of a string descriptor.
   DOS employs a different method entirely for its strings, using a CHR$(0)
to mark the end.  This is described separately later in the section "DOS
Strings."


BASIC NEAR STRINGS

A BASIC string descriptor is a table containing information about the
string--that is, its length and address.  In Microsoft compiled BASIC a
string descriptor is comprised of two words of information.  For QuickBASIC
and near strings when using BASIC PDS, the first word contains the length
of the string and the second holds the address of the first character. 
Consider the following BASIC instructions:

   X$ = "Assembler"
   V = VARPTR(X$)

V now holds the starting address of the four-byte descriptor for X$.  For
the sake of argument, let's say that V is now 1234.  Addresses 1234 and
1235 will together contain the length of X$ which is 9, and addresses 1236
and 1237 will contain yet another address--that of the first character in
X$.  You can therefore find the length of X$ using this formula:

   Length = PEEK(V) + 256 * PEEK(V + 1)

And the first character "A" can be located with this:

   Addr = PEEK(V + 2) + 256 * PEEK(V + 3)

You could then print the string on the screen like this:

   FOR C = Addr TO Addr + 8
     PRINT CHR$(PEEK(C));
   NEXT
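
For comparison, the same lookup is modeled here in Python, with a byte array
standing in for BASIC's near data segment.  This is a sketch under obvious
assumptions: the addresses 1234 and 5000 are arbitrary, and the function
names are hypothetical.

```python
import struct

def read_descriptor(memory, v):
    """Return (length, address) from the 4-byte string descriptor at v."""
    # "<HH" means two little-endian 16-bit words: length, then address.
    length, address = struct.unpack_from("<HH", memory, v)
    return length, address

def read_string(memory, v):
    """Follow the descriptor at v and return the string data it describes."""
    length, address = read_descriptor(memory, v)
    return bytes(memory[address:address + length])

# Build a pretend 64K data segment holding X$ and its descriptor.
memory = bytearray(65536)
memory[5000:5009] = b"Assembler"
struct.pack_into("<HH", memory, 1234, 9, 5000)

text = read_string(memory, 1234)
```

Note that the descriptor and the string data need not be near each other in
memory; the descriptor simply says where the characters are.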

Therefore, this is a BASIC model for how strings are located by an assembly
language program.  When you call an assembler routine with a string
argument, BASIC first pushes the address of the descriptor onto the stack,
before calling the routine.  The next example is called Upper, because it
capitalizes all of the characters in a string.  Even though BASIC offers
the UCASE$ and LCASE$ functions, these are relatively slow because they
return a copy of the data that has been manipulated.  Upper instead
capitalizes the data in place very quickly.
   The strategy is to first get the descriptor address from the stack. 
Then Upper puts the length into BX and the address of the string data into
SI.  Upper steps through the string starting at the end, decrementing BX
by one for each character.  When BX crosses zero, it is done.  A BASIC
version is shown first, followed by the assembly language equivalent.

Upper in BASIC:

SUB Upper(Work$) STATIC

  '-- load SI with the address of Work$ descriptor
  SI = VARPTR(Work$)

  '-- assign LEN(Work$) to BX
  BX = PEEK(SI) + 256 * PEEK(SI + 1)

  '-- the address of the first character goes in SI
  SI = PEEK(SI + 2) + 256 * PEEK(SI + 3)

More:
  BX = BX - 1                'step backward one character
  IF BX < 0 GOTO Done        'no more characters to do
  AL = PEEK(SI + BX)         'get the current character
  IF AL < ASC("a") GOTO More 'skip conversion if too low
  IF AL > ASC("z") GOTO More 'or if too high
  AL = AL - 32               'convert to upper case
  POKE SI + BX, AL           'put character back in Work$
  GOTO More                  'go do it all again

Done:                        'return to caller (EXIT is reserved in BASIC)

END SUB


Upper in assembly language:

Upper Proc, Work:Word

  Mov  SI,Work    ;load SI with Work$'s descriptor address
  Mov  BX,[SI]    ;put LEN(Work$) into BX
  Mov  SI,[SI+2]  ;SI holds address of the first character

More:
  Dec  BX         ;point to the previous character
  Js   Exit       ;if sign is negative BX is less than 0
  Mov  AL,[BX+SI] ;put the current character into AL
  Cmp  AL,"a"     ;compare it to ASC("a")
  Jb   More       ;jump if below to More
  Cmp  AL,"z"     ;compare AL to ASC("z")
  Ja   More       ;jump if above to More
  Sub  AL,32      ;convert AL to upper case
  Mov  [BX+SI],AL ;put AL back into Work$
  Jmp  More       ;jump to More

Exit:
  Ret             ;return to BASIC

Upper Endp
End

What's Your Sign?

Notice that for expediency, these routines work backwards from the end of
the string.  There are a number of shortcuts that you can use in assembly
language, and one important one is being able to quickly test the result
of the most recent numeric operation.  If the program worked forward
through the string, it would take three lines of code to advance to the
next character, and also require saving the string length separately:

   Inc  BX           ;point to the next character
   Cmp  BX,Length    ;are we done yet?
   Jne  More         ;no, continue

Notice the use of a new form of conditional jump--Js which stands for *Jump
if Signed*.  Here the code tests the sign of the number in BX, and jumps
if it is negative.  Though I haven't mentioned this yet, a conditional jump
doesn't always have to follow a compare.  Although a comparison will set
the flags in the 8088 that indicate whether a particular condition is true,
so will several other instructions.  Some of these are Add, Sub, Dec, and
Inc, but not Mov.  So instead of having to include an explicit comparison:

   Dec  BX           ;decrement BX
   Cmp  BX,0         ;compare it to zero
   Jl   Exit         ;jump if less to Exit

All that is really needed is this:

   Dec  BX
   Js   Exit

The Dec instruction sets the Sign Flag automatically, just as if a separate
compare had been performed.
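
The effect of Dec followed by Js can be modeled in BASIC, in the same
spirit as the BASIC version of Upper.  This is only a sketch: SignFlag is
a made-up variable standing in for the 8088 Sign Flag, and the starting
value of 3 is arbitrary.

```basic
DEFINT A-Z                   'use 16-bit integers like the 8088 registers

BX = 3                       'pretend this is LEN(Work$)
DO
  BX = BX - 1                'Dec BX
  SignFlag = (BX < 0)        'Dec sets the Sign Flag when bit 15 is 1
  IF SignFlag THEN EXIT DO   'Js Exit
  PRINT "Processing character at offset"; BX
LOOP
PRINT "Done, BX ="; BX
```

Offsets 2, 1, and 0 are processed, and the loop ends as soon as BX becomes
negative, with no separate comparison needed.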


Conditional Jump Instructions

Besides Je, Jne, and Js, there are a few other forms of conditional jump
instructions you should understand.  Figure 12-6 lists all of the ones you
are likely to find useful.


Command   Meaning
=======   ======================================
  Je      Jump if equal
  Jne     Jump if not equal
  Ja      Jump if above (unsigned basis)
  Jna     Jump if not above (unsigned basis)
  Jb      Jump if below (unsigned basis)
  Jnb     Jump if not below (unsigned basis)
  Jg      Jump if greater (signed basis)
  Jng     Jump if not greater (signed basis)
  Jl      Jump if less (signed basis)
  Jnl     Jump if not less (signed basis)
  Jc      Jump if Carry Flag is set
  Jnc     Jump if Carry Flag is clear
  Js      Jump if sign flag is set
  Jns     Jump if sign flag is not set
  Jcxz    Jump if CX is zero

Figure 12-6: The 8088 conditional jump instructions.


You should know that Je and Jne also have alias names: Jz and Jnz.  These
stand for *Jump if Zero* and *Jump if Not Zero* respectively, and they are
identical to Je and Jne.  In fact, though I didn't mention this earlier,
the Repe and Repne string repeat prefixes are sometimes called Repz and
Repnz.
   Because Je and Jz cause MASM to generate the identical machine code
bytes, they may be used interchangeably.  In some cases you may want to use
one instead of the other, depending on the logic in your program.  For
example, after comparing two values you would probably use Je or Jne to
branch if they are equal or not equal.  But after testing for a zero or
non-zero value using Or AX,AX you would probably use Jz or Jnz.  This is
really just a matter of semantics, and either version can be used with the
same results.
   Also, please understand that Jnb is not the same as Ja.  Rather, the
case of being Not Below is the same as being Above Or Equal.  In fact, MASM
recognizes Jae (Jump if Above or Equal) to mean the same thing as Jnb. 
Likewise, Jbe (Jump if Below or Equal) is the same as Jna, Jge (Jump if
Greater or Equal) is the same as Jnl, and Jle (Jump if Less or Equal) is
identical to Jng.  Again, which form of these instructions you use will
depend on how you are viewing the data and comparisons.
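
The difference between the signed and unsigned jumps is easier to see with
a value that has its highest bit set.  The following BASIC sketch is only
a model; the variable names are borrowed from the registers for
illustration, and the PRINT messages simply report which way each jump
would go after Cmp AX,BX.

```basic
DEFINT A-Z

AX = &HFFFF                  'all 16 bits set
BX = 1

'Signed view, as used by Jg and Jl: AX is -1
IF AX > BX THEN
  PRINT "Jg would jump"
ELSE
  PRINT "Jg would not jump"
END IF

'Unsigned view, as used by Ja and Jb: AX is 65535
UnsignedAX& = AX AND &HFFFF&
UnsignedBX& = BX AND &HFFFF&
IF UnsignedAX& > UnsignedBX& THEN
  PRINT "Ja would jump"
ELSE
  PRINT "Ja would not jump"
END IF
```

The same bit pattern is less than 1 on a signed basis and far above it on
an unsigned basis, which is why the two families of jumps exist.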
   Note the special form of conditional jump, Jcxz.  Jcxz stands for Jump
if CX is Zero, and it combines the effects of Cmp CX,0 and Je label into
a single fast instruction.  Jcxz is also commonly used prior to a Loop
instruction.  When you use Loop to perform an operation repeatedly, CX must
first be set to the number of times the loop is to be executed.  But
because Loop decrements CX before testing it, a starting value of zero
makes the loop execute 65536 times!  Thus, placing Jcxz Exit ahead of the
loop avoids this undesirable behavior if zero was passed accidentally.
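
To see where the figure of 65536 comes from, here is a small BASIC model
of the Loop instruction.  It is a sketch only: CX and Count are ordinary
BASIC variables, and the AND operation imitates the way a 16-bit register
wraps around from 0 to 65535.

```basic
DEFLNG A-Z                   'long integers hold 0 to 65535 safely

CX = 0                       'the value passed in by mistake
Count = 0
DO
  Count = Count + 1          'the body of the loop runs here
  CX = (CX - 1) AND &HFFFF&  'Loop decrements CX, wrapping at 16 bits
LOOP WHILE CX <> 0           'and jumps back only if CX is not zero
PRINT "The loop ran"; Count; "times"
```

Because the first decrement wraps CX to 65535, the body runs 65536 times
before CX finally reaches zero.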
   Finally, you must be aware that a conditional jump cannot be used to
branch to a label that is more than 128 bytes earlier, or 127 bytes farther
ahead in the code.  A conditional jump instruction is only two bytes, with
the first indicating the instruction and the other holding the branch
distance as a signed byte.  If you need to jump to a label farther away
than that you must reverse the sense of the condition, and jump to a near
label that skips over another, unconditional jump:

   Cmp  AX,BX             ;we want to jump to Label: if AX is greater
   Jna  NearLabel         ;so jump to NearLabel if it's NOT greater
   Jmp  Label             ;this goes to Label: which is farther away
   NearLabel:
    .
    .

As used here, the unconditional Jmp instruction can branch to any location
within the current code segment.  There is also a short form of Jmp, which
requires only two bytes of code instead of three.  If you are jumping
backwards in the program and the address is within 128 bytes, MASM uses the
shorter form automatically.  But if the jump is forward, you should specify
Short explicitly: Jmp Short Label.  Some non-Microsoft assemblers do not
require you to specify Short; the newest MASM version 6.x also adjusts its
generated code to avoid the extra wasted byte.
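
The range limit described above for conditional and short jumps can be
checked with simple arithmetic.  In this BASIC sketch the two addresses
are hypothetical values chosen for illustration; the distance is counted
from the byte that follows the two-byte jump instruction.

```basic
DEFLNG A-Z

JumpAddr = &H100             'hypothetical address of the jump instruction
Target = &H17F               'hypothetical address of the label

'The distance is measured from the byte following the 2-byte jump
Distance = Target - (JumpAddr + 2)
IF Distance >= -128 AND Distance <= 127 THEN
  PRINT "In range, distance byte ="; Distance
ELSE
  PRINT "Too far:"; Distance; "bytes, so use a full Jmp"
END IF
```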


DOS STRINGS

When string information is passed to a DOS routine, for example when giving
a file or directory name, the string must end with a CHR$(0).  In DOS
terminology this is called an ASCIIZ string.  (Do not confuse this with a
CHR$(26) Ctrl-Z which marks the end of a file.)  Unlike BASIC, DOS does
not use string descriptors, so this is the only way DOS can tell when it
has reached the end.  By the same token, when DOS returns a string to a
calling program, it marks the end with a trailing zero byte.
   When passing a string to a DOS service from BASIC you must either
concatenate a CHR$(0) manually, or add extra code within the assembler
routine to copy the name into local storage and add a zero byte to the
copy.  From BASIC you would therefore use something like this:

   CALL Routine(FileName$ + CHR$(0))
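
Going the other way, a string that DOS has returned must be trimmed at its
zero byte before BASIC can use it.  The sketch below shows both
directions.  Note that Buffer$ is filled in by hand here to stand in for
what a DOS service might return, and the file and directory names are
examples only.

```basic
'Sending a name to DOS: append the zero byte
FileName$ = "C:\AUTOEXEC.BAT"
DosName$ = FileName$ + CHR$(0)

'Receiving a name from DOS: keep only what precedes the zero byte.
'Buffer$ is assigned manually to imitate a buffer that DOS filled in.
Buffer$ = "C:\DOS" + CHR$(0) + SPACE$(58)
Zero% = INSTR(Buffer$, CHR$(0))
IF Zero% THEN Buffer$ = LEFT$(Buffer$, Zero% - 1)
PRINT Buffer$, LEN(Buffer$)
```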


BASIC FIXED-LENGTH STRINGS

Fixed-length strings and the string portion of a TYPE variable do not use
a string descriptor, which you might think would require a different
strategy to access them.  But whenever a fixed-length string is used as an
argument to an assembler routine or BASIC subprogram, BASIC first copies
it into a temporary conventional string, and it is the temporary string
that is passed to the routine.  When the routine returns, BASIC copies the
characters back into the original fixed-length string.  Thus, any routine
written in assembly language that expects a descriptor will work correctly,
regardless of the type of string being sent.
   Of course, this copying requires BASIC to generate many extra bytes of
assembler code for each call.  If you do not want BASIC to create a
temporary copy of a fixed-length string, you must first define the string
as a TYPE like this:

   TYPE FLen
     S AS STRING * 20
   END TYPE
   DIM FString AS FLen

Though this appears to be the same as defining FString as a string with a
fixed length of 20, there is an important difference: declaring it as a
TYPE tells BASIC not to make a copy.  That is, BASIC does not treat FString
as a string, as long as the ".S" portion that identifies it as a string is
not used.  Here's an example based on the FLen TYPE that was defined above:

   DIM FString AS FLen           'FString is a TYPE variable
   FString.S = "This is a test"  'assign the string portion
   CALL Routine(FString)         'call the routine without .S

Here, the address of the first character in the string is passed to the
routine, as opposed to the address of a temporary string descriptor.  We
have told BASIC to call Routine, and pass it the entire FString TYPE but
without interpreting the .S string component.  This next example does cause
BASIC to create a temporary copy:

   CALL Routine(FString.S)

The short assembly language routine that follows expects the address of a
fixed-length string with a length of 20, as opposed to the address of a
string descriptor.  The routine then copies the characters to the
upper-left corner of a color monitor.


   Push BP         ;access the stack as usual
   Mov  BP,SP
   Mov  SI,[BP+6]  ;SI points to the first character
   Mov  DI,0       ;the first address in screen memory
   Mov  AX,0B800h  ;color monitor segment when in text mode
   Mov  ES,AX      ;move into ES through AX
   Mov  CX,20      ;prepare to copy 20 characters
   Cld             ;clear the direction flag to copy forward

More:
   Movsb           ;copy a byte to screen memory
   Inc  DI         ;skip over the attribute byte
   Loop More       ;loop until done
   Pop  BP         ;restore BP
   Ret  2          ;return to BASIC


Recall that the color monitor segment value of 0B800h must be assigned to
ES through AX, because it is not legal to assign a segment register from
a constant.  Also, notice the way that DI is cleared to zero.  Although Mov
DI,0 indeed moves a zero into DI, this is not the most efficient way to
clear a register.  Any time a numeric value is used in a program (0 in this
case), that much extra space is needed to store the actual value as part
of the instruction.  A preferred method for clearing a register is with
the Xor instruction.  That is, Xor DI,DI gives the same result as Mov DI,0
except it is one byte shorter and slightly faster.
   When Xor is performed on any two values, only those bits that are
different are set to 1.  But since the same register is used here for both
operands, all of the result bits will be cleared to 0.  The code for using
Xor is decidedly less obvious, but you'll see Xor used this way very often
in assembly listings in magazines and books.  Another, equally efficient
way to clear a register is to subtract it from itself using Sub AX,AX.
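
BASIC's own XOR operator behaves the same way on integers, so the idea can
be tried without an assembler.  This is merely an illustration; the values
are arbitrary, and DI here is a plain BASIC variable rather than the
register.

```basic
DEFINT A-Z

A = &H5A                     'binary 01011010
B = &H3C                     'binary 00111100
PRINT HEX$(A XOR B)          'only the differing bits remain: 66 hex

DI = 12345
DI = DI XOR DI               'every bit matches itself, so DI is now 0
PRINT DI
```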


FAR STRINGS IN BASIC PDS

Accessing near strings in QuickBASIC and BASIC PDS is a relatively simple
task, because both the descriptor and the string data are known to be in
near DGROUP memory.  But BASIC PDS also supports far strings, where the
data may be in a different segment.  The composition of a far string
descriptor was shown in Chapter 2; however, you do not need to manipulate
these descriptors yourself directly.
   BASIC PDS includes two routines--StringLength and StringAddress--that
do the work of locating far strings for you.  Further, because Microsoft
could change the way far strings are organized in the future, it makes the
most sense to use the routines Microsoft supplies.  If the layout of far
string descriptors changes, your program will still work as expected.
   StringLength and StringAddress expect the address of the string
descriptor, and they return the string's length and segmented address
respectively.  Note that while far string data may be in nearly any
segment, the descriptors themselves are always in DGROUP.  Also note that
these routines are not very well-behaved.  In particular, registers you may
be using are changed by the routines.  To solve this problem and also to
let you get all of the information in a single call, I have written the
StringInfo routine.  StringInfo is contained in the FAR$.ASM file on the
accompanying disk.

;from an idea originally by Jay Munro
.Model Medium, Basic
  Extrn StringAddress:Proc ;these are part of PDS
  Extrn StringLength:Proc

.Code
StringInfo Proc Uses SI DI BX ES

  Pushf                    ;save the flags manually

  Push ES                  ;save ES for later
  Push SI                  ;pass incoming descriptor
  Call StringAddress       ;call the PDS routine

  Pop  ES                  ;restore ES for StringLength
  Push AX                  ;save offset and segment
  Push DX                  ;  returned by StringAddress

  Push SI                  ;pass incoming descriptor
  Call StringLength        ;get the length
  Mov  CX,AX               ;copy the length to CX

  Pop  DX                  ;retrieve the saved Segment
  Pop  AX                  ;and the address

  Popf                     ;restore the flags manually
  Ret                      ;restore registers and return

StringInfo Endp
End

StringInfo is called with DS:SI pointing to the string descriptor, and it
returns the length in CX and the address of the string data in DX:AX. 
Although StringInfo could be designed to return the segment in DS or ES,
it is safer to assign the segment registers yourself manually.
   Notice the Uses clause--this tells MASM that the named registers must
be preserved, and generates additional code to push those registers upon
entry to the procedure, and pop them again upon exit.
   Also notice the new Extrn directives at the beginning of the source
file.  These tell the assembler that the stated routines are not in the
current source file.  MASM then places the external names in the object
file header, with instructions for LINK to fill in the address portion of
each Call.  Data must also be declared as external if it is not in the same
source file as the routine being assembled.  Likewise, when a data item is
to be made available to other modules, the file that defines it must have
a corresponding Public statement:

   .Model Medium, Basic
   .Data
     Public MyData
     MyData DW 12345
      .
      .


ACCESSING ARRAYS
================

As you have seen, a conventional variable is passed to an assembly language
subroutine by placing its address onto the stack.  If the variable is a
string, then the address passed is that of its descriptor, and the string
data address is read from there.  Accessing array elements is only slightly
more involved, because array elements are always stored in adjacent memory
locations.  Let's look first at integer arrays.
   When BASIC encounters the statement DIM X%(100) in your program, it
allocates a contiguous block of memory 202 bytes long.  (Unless you first
used the statement OPTION BASE 1, dimensioning an array to 100 means 101
elements.)  The first two bytes in this block hold the data for X%(0), the
next two bytes hold X%(1), and so forth.  When you ask VARPTR to find
X%(0), the address it returns is the start of this block of memory.
   The address of subsequent array elements may then be easily computed
from this base address.  But with a dynamic array, the segment that holds
the array may not be the same as the segment where regular variables are
stored.  Also, huge arrays that span more than 64K require extra care when
crossing a 64K segment boundary.
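
For a static integer array the address computation can be tried directly
in BASIC.  The following is a sketch under the assumption that X%() is a
static array in near memory; the element number 37 and the test value are
arbitrary.

```basic
DEFINT A-Z
DIM X(100)                   'a static array, stored in DGROUP
X(37) = 12345

Start& = VARPTR(X(0))        'address of the first element
IF Start& < 0 THEN Start& = Start& + 65536   'treat it as unsigned

Addr& = Start& + 2 * 37      'each integer element occupies 2 bytes
DEF SEG = VARSEG(X(0))
Value& = PEEK(Addr&) + 256& * PEEK(Addr& + 1)
DEF SEG
PRINT "X(37) ="; Value&
```

An assembler routine does the same thing when it adds twice the element
number to the base address it was given.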
   String arrays are structured in a similar fashion, in that each element
follows the previous one in memory.  For each string array element that is
dimensioned, four bytes are set aside.  These bytes comprise a table of
descriptors which contain the length and address words for each element in
the array.  But the important point is that once you know where one element
or string descriptor is located, it is easy to find all of those that are
adjacent.  Following is a QuickBASIC example that shows how to locate
Array$(15), based on the VARPTR address of Array$(0).


DIM Array$(100)
Array$(15) = "Find me"

Descriptor = VARPTR(Array$(0))
Descriptor = Descriptor + (4 * 15)

Length = PEEK(Descriptor) + 256 * PEEK(Descriptor + 1)
PRINT "Length ="; Length

Addr = PEEK(Descriptor + 2) + 256 * PEEK(Descriptor + 3)
PRINT "String = ";
FOR X = Addr TO Addr + Length - 1
  PRINT CHR$(PEEK(X));
NEXT


DYNAMIC ARRAYS

Most of the routines shown so far manipulated variables that are located
in near memory.  BASIC can store numeric, TYPE, and fixed-length string
arrays in far memory, and additional steps are needed to read from and
write to those arrays.
   When an assembly language routine receives control after a call from
BASIC, it can access your regular variables because they are in the default
data segment.  Most memory accesses assume the data is in the segment held
in the DS register.  For example, the statement Mov [BX],AX assigns the
value in AX to the memory location identified by BX within the segment held
in DS.  Likewise, Sub [DI+10],CX subtracts the value held in CX from the
memory address expressed as DI+10, where that address is again in the
default data segment.
   It is also possible to specify a segment other than the current default. 
One way is with a *segment override* command, like this:

   Mov ES:[BX],AX

Here, the segment held in ES is used instead of DS.  A segment override
adds only one byte of code, so it is quite efficient.  If you plan to
access data in a different segment many times, you can optionally set DS
to that segment.  However, it is mandatory that you reset DS to its
original value before returning to BASIC.  You must also understand that
changing DS means you no longer have direct access to DGROUP.  In that case
you could use the stack segment as an override, since the stack segment is
always the same as the data segment in a BASIC program.  The next short
example shows this in context.

   Push DS                ;save DS
   Mov DS,FarSegment      ;now DS points to your far data
    .                     ;access that far data here
    .
   Mov AX,SS:[Variable]   ;access Variable in DGROUP
    .                     ;access more far data here
   Pop DS                 ;restore DS before returning

When Microsoft introduced QuickBASIC version 2.0, one of the most exciting
new features it offered was support for dynamic numeric arrays.  Unlike
QuickBASIC near strings, string arrays, and non-array variables, these
arrays are always located outside of BASIC's near 64K data segment.  This
means that an assembler routine needs some way to know both the address and
the segment for an array element that is passed to it.
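
BASIC can report both values itself through VARSEG and VARPTR, which makes
it easy to confirm that a dynamic array lives outside of DGROUP.  The
sketch below assumes a dynamic integer array; Temp is simply an ordinary
near variable used for comparison, and the array size and test value are
arbitrary.

```basic
DEFINT A-Z
'$DYNAMIC
DIM Array(1 TO 1000)         'a dynamic array, stored in far memory
Array(2) = 777

PRINT "DGROUP segment: "; HEX$(VARSEG(Temp))
PRINT "Array segment:  "; HEX$(VARSEG(Array(1)))

Addr& = VARPTR(Array(1))     'address within the array's segment
IF Addr& < 0 THEN Addr& = Addr& + 65536
DEF SEG = VARSEG(Array(1))   'the BASIC counterpart of loading DS or ES
PRINT "Array(2) ="; PEEK(Addr& + 2) + 256& * PEEK(Addr& + 3)
DEF SEG
```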
   In general, routines you design that work on an entire array will be
written to expect a particular starting element.  The routine can then
assume that all of the subsequent elements lie before or after it in
memory.  Unfortunately, this does not always work unless you add extra
steps.  If you call an assembly language routine passing one element of a
far-memory dynamic array like this:

   CALL Routine(Array(1))

BASIC makes a copy of the array element into a temporary variable in near
memory, and then passes the address of that copy to the routine.  Thus,
while the routine can still receive an array element's value, it has no way
to determine its true address.  And without the address, there is no way
to get at the rest of the array.
   Since being able to pass an entire array is obviously important, BASIC
supports two options to the CALL command--SEG and BYVAL.  The SEG keyword
indicates that both the address and the segment are to be passed on the
stack, and it also tells BASIC not to make a copy of the array element. 
SEG is used with an array element (or any variable, for that matter) like
this:

   CALL Routine(SEG Array%(1))

You could also send the segment and address manually, like this:

   CALL Routine(BYVAL VARSEG(Array%(1)), BYVAL VARPTR(Array%(1)))

In both cases, BASIC first pushes the segment where the element resides
onto the stack, followed by the element's address within that segment.  By
pushing them in this order the routine can conveniently use either Lds
(Load DS) or Les (Load ES) to get both the segment and address in one
operation:

   Les DI,[BP+6]       ;if using manual stack addressing
or
   Les BX,[StackArg]   ;if using MASM's simplified directives

Les loads four bytes in one operation, placing the lower word at [BP+6]
into the named register (DI in the first example case), and the higher word
at [BP+8] into ES.  Lds works the same, except the higher word is instead
moved into DS.  Once the segment and address are loaded, you can access all
of the array elements:

   Push DS              ;save DS
   Lds  SI,[BP+6]       ;now DS:SI points at first element
   Mov  [SI],AX         ;assign Array%(1) from AX
   Add  SI,2            ;now SI points at the next element
   Mov  [SI],BX         ;assign Array%(2) from BX
   Pop  DS              ;restore DS
    .                   ;continue
    .

If Les were used instead of Lds, then an ES: override would be needed to
assign the elements.  Although you must always preserve the contents of
DS regardless of the version of BASIC, some registers need to be saved only
when using BASIC PDS far strings.  Other registers do not need to be saved
at all.  Figure 12-7 shows which registers must be preserved based on the
version of BASIC.


 QuickBASIC and       BASIC PDS
PDS near strings     far strings
================     ===========
      DS                 DS
      SS                 SS
      BP                 BP
      SP                 SP
                         ES
                         SI
                         DI

Figure 12-7: The registers that must be preserved in an assembly language
subroutine.


Besides having to save and restore the registers shown in Figure 12-7, you
must also be sure that the Direction Flag is cleared to forward before
returning to BASIC.  The Direction Flag affects the 8088 string operations,
and is by default set to forward.  You can usually ignore the Direction
Flag unless you set it to backwards explicitly with the Std instruction.
In that case, you must use a corresponding Cld instruction before
returning.


Huge Arrays

A huge array is one that spans more than one 64K segment, and as you can
imagine, it requires extra steps to access all of the elements.  That is,
the assembler routine must know which elements are in what segment, and
manually load those segments as needed.  The following code fragment shows
how to walk through all of the elements in a huge long integer array and,
just for the sake of the example, adds the elements to determine their sum.
   A simple setup example and call syntax for this routine is as follows:

   REDIM Array&(1 TO 30000)
   FOR X% = 1 TO 30000
     Array&(X%) = X%
   NEXT

   CALL SumArray(SEG Array&(1), 30000, Sum&)
   PRINT "Sum& ="; Sum&


And here's the code for the SumArray routine:

.Model Medium, Basic
.Code

SumArray Proc Uses SI, Array:DWord, NumEls:Word, Sum:Word

  Push DS          ;save DS so we can restore it later
  Push SI          ;PDS far strings require saving SI too

  Xor  AX,AX       ;clear AX and DX which will accumulate
  Mov  DX,AX       ; the total

  Mov  BX,NumEls   ;get the address for NumElements%
  Mov  CX,[BX]     ;read NumElements% before changing DS
  Lds  SI,Array    ;load the address of the first element
  Jcxz Exit        ;exit if NumElements = 0

Do:
  Add  AX,[SI]     ;add the value of the low word
  Adc  DX,[SI+2]   ;and then add the high word
  Add  SI,4        ;point to the next array element

  Or   SI,SI       ;are we beyond a 32k boundary?
  Jns  More        ;no, continue

  Sub  SI,8000h    ;yes, subtract 32k from the address
  Mov  BX,DS       ;copy DS into BX
  Add  BX,800h     ;adjust the segment to compensate
  Mov  DS,BX       ;copy BX back into DS

More:
  Loop Do          ;loop until done

Exit:
  Pop  SI          ;restore SI for BASIC
  Pop  DS          ;restore DS and gain access to Sum&
  Mov  BX,Sum      ;get the DGROUP address for Sum&
  Mov  [BX],AX     ;assign the low word
  Mov  [BX+2],DX   ;and then the high word

  Ret              ;return to BASIC

SumArray Endp
End

The segment bounds checking is handled by the six lines that start with
Or SI,SI.  The idea is to see if the address is beyond 32767, subtract
32768 if it is, and then adjust the segment to compensate.  The most direct
way would have been with Cmp SI,32767 and then Jna More, but Cmp used this
way generates four bytes of code, whereas Or creates only two bytes.
Since Or sets the Sign flag if the number is negative (above 32767), you
can use it to know when the address adjustment is needed.
   Because arithmetic cannot be performed directly on a segment register,
DS is first copied to BX, 800h is added to that, and the result is then
copied back to DS.  800h is used instead of 8000h (32768) because a new
segment begins every 16 bytes.  [That is, adding 800h to a segment value is
the same as adding 8000h to the address.]
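
The equivalence is easy to verify by converting both forms to a single
20-bit address, which is the segment times 16 plus the offset.  In this
sketch the starting segment of 2000h is a hypothetical value; any segment
gives the same agreement.

```basic
DEFLNG A-Z

Segment = &H2000             'hypothetical segment held in DS
Address = &H8000&            'SI has just reached 32768

Before = Segment * 16 + Address
After = (Segment + &H800) * 16 + (Address - &H8000&)

PRINT HEX$(Before), HEX$(After)
```

Both values print as 28000 hex, so the adjusted segment and address still
refer to the same byte in memory.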
   SumArray also introduces a new instruction:  Adc means Add with Carry,
and it is used to add long integer values that by definition span two
words.  When you add two registers--say, AX and BX--if the result exceeds
65535 only the low 16 bits are kept.  However, the Carry Flag is set to
indicate the overflow condition.  Adc takes this into account, and adds
one more to its result if the Carry Flag is set.  Therefore, whenever two
long integers are added you'll use Add to combine the lower words, and Adc
for the high words.  Similarly, subtracting long integers requires that you
use Sub to subtract the lower words and then Sbb (Subtract with Borrow) on
the upper words.
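
The Add and Adc pair can be imitated in BASIC by splitting each long
integer into two words.  This is a model only, and it assumes both values
are positive; Carry, LowWord, and HighWord are illustrative names standing
in for the Carry Flag, AX, and DX.

```basic
DEFLNG A-Z

A = 70000: B = 130000        'two long integers to add

LowSum = (A AND &HFFFF&) + (B AND &HFFFF&)   'Add the low words
Carry = LowSum \ 65536       '1 if the low words overflowed 16 bits
LowWord = LowSum AND &HFFFF& 'what is kept in AX
HighWord = (A \ 65536) + (B \ 65536) + Carry 'Adc the high words

PRINT HighWord * 65536 + LowWord   'matches A + B
```

Here the low words add up to more than 65535, so the carry of 1 is needed
to arrive at the correct total of 200000.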
   Although MASM's simplified directives hide the details from you, when
more than one parameter is passed to an assembly language routine it is the
last in the list that is at [BP+6] on the stack.  When each argument
occupies one word, the previous argument is at [BP+8], and the one before
that is at [BP+10].  Because the stack grows downward as new items are
pushed onto it, each subsequent argument is at a lower address.
   Finally, in a real program this routine would probably be designed as
a function.  Using a function avoids having to pass the Sum& parameter to
receive the returned value, and helps reduce the size of the program.


ASSEMBLER FUNCTIONS
===================

Designing a procedure as a function lets you return information to a
program, but without the need for an extra passed parameter.  Functions are
also useful because BASIC performs any necessary data type conversion
automatically.  For example, if you have written a function that returns
an integer value, you can freely assign the result to a single precision
variable.
   You can also test the result of a function directly using IF, display
it directly with PRINT, or pass it as a parameter to another procedure. 
Some typical examples are shown here:

   SingleVar! = MyFunction%

   IF YourFunction&(Argument%) > 1004 THEN ...

   PRINT HisFunction$(Any$)

Beginning with QuickBASIC version 4.0, functions written in assembly
language may be added to a BASIC program.  To have a function return an
integer value, simply place the value into the AX register before returning
to BASIC.  If the function is to return a long integer, both DX and AX are
used.  In that case, DX holds the higher word and AX holds the lower one.


STRING FUNCTIONS

String functions are only slightly more complicated to design.  A string
function also uses AX as a return value, but in this case AX holds the
address of a string descriptor you have created.  The complete short string
function that follows accepts an integer argument, and returns the string
"False" if the argument is zero or "True" if it is not.

;Syntax:
;DECLARE FUNCTION TrueFalse$(Argument%)
;Answer$ = TrueFalse$(Argument%)

.Model Medium, Basic
.Data
  DescLen DW 0
  DescAdr DW 0
  True    DB "True"
  False   DB "False"

.Code
TrueFalse Proc, Argument:Word

  Mov  DescLen,4            ;assume true
  Mov  DescAdr,Offset True

  Mov  BX,Argument          ;get the address for Argument%
  Cmp  Word Ptr [BX],0      ;is it zero?
  Jne  Exit                 ;no, so we were right
  Inc  DescLen              ;yes, return five characters
  Mov  DescAdr,Offset False ;and the address of "False"

Exit:
  Mov  AX,Offset DescLen    ;show where the descriptor is
  Ret                       ;return to BASIC

TrueFalse Endp
End

Although the function is declared using a dollar sign in the name, the
actual procedure omits that.  [The dollar sign merely tells BASIC what type
of information will be returned.  It is not part of the actual procedure
name.]  TrueFalse begins by defining a string descriptor in the .Data
segment.  It is also possible to store strings and other data in the code
segment and access it with a CS: segment override.  However, data that is
returned as a function must be in DGROUP, and so must the descriptor.
   The first two statements set the descriptor to an output string length
of four characters and to the address of the message "True".  Then, the
address of Argument is obtained from the stack, and its value is compared
to zero.  If it is not zero, then the descriptor is already correct and the
function can proceed.  Otherwise, the descriptor length is incremented to
reflect the correct length, and the address portion is reassigned to show
where the string "False" begins in memory.  In either case, the final steps
are to load AX with the address of the descriptor, and then return to
BASIC.
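
For reference, the behavior expected from TrueFalse$ can be written as an
ordinary BASIC function.  This is a stand-in for testing and comparison,
not a translation of the assembler code; it would replace the assembler
version rather than be linked alongside it.

```basic
DECLARE FUNCTION TrueFalse$ (Argument%)

PRINT TrueFalse$(0), TrueFalse$(-1), TrueFalse$(5)

FUNCTION TrueFalse$ (Argument%) STATIC
  IF Argument% = 0 THEN
    TrueFalse$ = "False"     'zero is the only false value
  ELSE
    TrueFalse$ = "True"      'anything else counts as true
  END IF
END FUNCTION
```

The three calls print False, True, and True respectively.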
   MASM also lets you access data using simple arithmetic.  For example,
the descriptor could have been defined as a single pair of words with one
name, and the second word could be accessed based on the address of the
first one like this:

   .Data
     Descriptor DW 0, 0
     True       DB "True"
     False      DB "False"

   .Code
      .
      .
     Inc  Descriptor
     Mov  Descriptor+2,Offset False
      .
      .


Far String Functions

Far string functions require more work to write than near string functions,
because of the added overhead needed to support far strings.  Fortunately,
BASIC includes routines that simplify the task for you.  Actually, the
routines to create and assign strings have always been included; it's just
that Microsoft never documented how to do it before BASIC 7.0.  Later in
this chapter I'll show code to create strings that works with all versions
of BASIC 4.0 or later.
   The StringAssign routine expects six arguments on the stack, for the
segment, address, and length of both the source and destination strings. 
StringAssign can assign from or to any combination of fixed- and variable-
length strings.  If the length argument for either string is zero, then
StringAssign knows that the address is that of a descriptor.  Otherwise,
the address is of the data in a fixed-length string.
   Because of the added overhead of obtaining values and pushing them on
the stack, I have created a short wrapper program that does this for you. 
MakeString accepts the same arguments as StringAssign, but they are passed
using registers rather than on the stack.  Of course, calling one routine
that in turn calls another takes additional time.  But the savings in code
size when MakeString is called repeatedly will overshadow the very slight
additional delay.
   MakeString is called with DX:AX holding the segmented address of the
source string, and CX holding its fixed length.  If the source is a
conventional string, CX is set to zero to indicate that.  The destination
address is identified with DS:DI, using BX to hold the length.  Again, BX
holds zero if the destination is not a fixed-length string.


;from an idea originally by Jay Munro
.Model Medium, Basic
  Extrn STRINGASSIGN:Proc

.Code
MakeString Proc Uses DS

  Push DX           ;push the segment of the source string
  Push AX           ;push the address of the source string
  Push CX           ;push the string length
  Push DS           ;push the segment of the destination
  Push DI           ;push the address of the destination
  Push BX           ;push the destination length

  Call STRINGASSIGN ;call BASIC to assign the string
  Ret

MakeString Endp
End


Now, with the assistance of MakeString, TrueFalse$ can be easily modified
to work with BASIC 7 far strings:

.Model Medium, Basic
  Extrn MakeString:Proc        ;this is in FAR$.ASM

.Data
  Descriptor DW 0, 0           ;the output string descriptor
  True       DB "True"
  False      DB "False"

.Code
TrueFalse Proc Uses ES DS SI DI, Argument:Word

  Mov  CX,4             ;assume true
  Mov  AX,Offset True

  Mov  BX,Argument      ;get the address for Argument%
  Cmp  Word Ptr [BX],0  ;is it zero?
  Jne  @F               ;no, so we were right

  Inc  CX               ;yes, assign five characters
  Mov  AX,Offset False  ;and use the address of "False"

@@:
  Mov  DX,DS                ;DX:AX = source (the text is in DGROUP)
  Mov  DI,Offset Descriptor ;DS:DI = destination descriptor
  Xor  BX,BX                ;0 means assign to a descriptor
  Call MakeString           ;let MakeString do the work

  Mov  AX,DI            ;AX = address of output descriptor
  Ret                   ;return to BASIC

TrueFalse Endp
End

Notice the introduction of the new at-symbol (@) assembler directive.  The
at-symbol and double at-symbol label are quite useful, because they let you
avoid having to create unique label names each time you specify the target
of a jump.  As with BASIC, creating many different label names is a
nuisance, and also impinges on the assembler's working memory.  When a
label is defined using @@: as a name, you can jump forward to it using @F
or backwards using @B.  Multiple @@: labels may be used in the same
program, and @F and @B always branch to the nearest one in the stated
direction.
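   As a small illustration, the fragment below uses two @@: labels to skip
over the leading blanks in a string.  It is only a sketch: the name Text
and the zero byte that ends the string are assumptions made for this
example, and the direction flag is assumed to be clear.

```asm
  Mov  SI,Offset Text   ;Text is a hypothetical zero-terminated string
@@:
  Lodsb                 ;load the next character into AL
  Cmp  AL," "           ;is it a blank?
  Je   @B               ;yes, go back to the @@: just above
  Or   AL,AL            ;no, but is it the zero terminator?
  Jz   @F               ;yes, skip ahead to the @@: below
  Dec  SI               ;no, point SI back at the non-blank character
@@:                     ;AL holds the character that ended the scan
```

Each @B and @F refers only to the nearest @@: label, so adding another @@:
between a jump and its intended target changes where that jump goes.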


FLOATING POINT FUNCTIONS

Single and double precision functions are handled in yet another manner. 
Although a single precision value could be returned in the DX:AX register
combination, a double precision result would need four registers, which is
impractical.  Further, a floating point number is most useful to BASIC if
it is stored in a memory location, rather than in registers.
   When BASIC invokes a floating point function it adds an extra, dummy
parameter to the end of the list of arguments you pass.  If no parameters
are being used, it creates one.  This parameter is the address into which
your routine is to place the outgoing result.  Because of this added
parameter, it is essential that you account for it when returning to BASIC. 
Thus, a function without arguments must use Ret 2, a function with one
argument needs Ret 4, and so forth.  Since we're using MASM's simplified
directives, all that is needed is to create an extra parameter name.
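   To make the added parameter visible, here is a sketch of a double
precision function with one argument, written without parameter names.
The function name Twice and its BASIC declaration are hypothetical, and
the file is meant to be assembled with MASM's /e (emulator) switch.

```asm
;Hypothetical: DECLARE FUNCTION Twice#(Variable#)
.Model Medium
  Public Twice
.Code
.8087                   ;allow 8087 instructions

Twice Proc              ;a far procedure in medium model
  Push BP               ;address the stack manually
  Mov  BP,SP
  Mov  BX,[BP+8]        ;the address of Variable#
  FLd  QWord Ptr [BX]   ;load Variable#
  FAdd QWord Ptr [BX]   ;add it to itself
  Mov  BX,[BP+6]        ;the address BASIC added for the result
  FStp QWord Ptr [BX]   ;store the result there
  FWait                 ;wait for the 8087 to finish
  Mov  AX,BX            ;return DX:AX holding the full
  Mov  DX,DS            ;  address of the output value
  Pop  BP
  Ret  4                ;discard two addresses of two bytes each
Twice Endp
End
```
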
   The short double precision function that follows squares a double
precision number much faster than using Value# ^ 2, and also shows how to
perform simple floating point math using assembly language.  You will
declare and invoke Square like this:

   DECLARE FUNCTION Square#(Variable#)
   Result# = Square#(Variable#)

;SQUARE.ASM, squares a double precision number
;
;WARNING: This file must be assembled using /e (emulator).

.Model Medium, Basic
.Code
.8087                   ;allow 8087 instructions

Square Proc, InValue:Word, OutValue:Word

  Mov  BX,InValue       ;get the address for InValue
  FLd  QWord Ptr [BX]   ;load InValue onto the 8087 stack
  FMul QWord Ptr [BX]   ;multiply InValue by itself

  Mov  BX,OutValue      ;get the address for OutValue
  FStp QWord Ptr [BX]   ;store the result there
  FWait                 ;wait for the 8087 to finish

  Mov  AX,BX            ;return DX:AX holding the full
  Mov  DX,DS            ;  address of the output value
  Ret                   ;return to BASIC

Square Endp
End

This Square function illustrates several important points.  The first is
the use of MASM's /e switch, which lets an assembly language routine share
BASIC's floating point emulator.  When a BASIC program begins, it looks to
see if an 8087 coprocessor is installed in the host PC.  If so, it uses one
set of library routines; otherwise it uses another.
   The library routines that use an 8087 simply modify the caller's code
to change the floating point interrupts that BASIC generates into actual
8087 instructions.  The routine then returns to the instruction it just
created and executes it.  Although this adds to the time needed to perform
a floating point operation, the code is patched only once.  Thus,
statements within a FOR or DO loop operate very quickly after the first
iteration.  This is very much like the method used by the BRUN library
described in Chapter 1.
   When no coprocessor is detected, the floating point interrupts that
BASIC generates are used to invoke routines in BASIC's floating point
software emulator.  As its name implies, an emulator imitates the behavior
of a coprocessor using assembly language commands.  A coprocessor can
perform a variety of floating point operations, including addition,
multiplication, and rounding, as well as some transcendental functions such
as logarithms and arctangents.
   When you use the /e switch, MASM adds extra information to the object
file header that tells LINK where to patch your 8087 instructions.  LINK
can then change your code to the equivalent floating point interrupts,
similar to the way BASIC patches its own code to change the interrupts to
8087 instructions.  Therefore, when you write floating point code that will
be called from BASIC, your routine can tie into BASIC's emulator, and use
it automatically if no coprocessor is installed.
   Also, notice the .8087 directive, which tells MASM not to issue an error
message when it sees those instructions.  Other, similar directives are
.287 and .387, and also .286 and .386.  These directives inform MASM that
you are intentionally using advanced commands that require these
processors, and have not made a typing error.
   The actual body of the Square function is fairly simple.  First, the
address of the incoming value is retrieved from the system stack, and then
the data at that address is loaded onto the coprocessor's stack using the
FLd (Floating point Load) instruction.  Since this is a double precision
value, QWord Ptr (Quad Word Pointer) is needed to indicate the size of the
data.  Had the incoming value been single precision, DWord Ptr (Double
Word Pointer) would be used instead.  One important feature of an 8087 or
software emulator is that a number may be converted from one numeric format
to another simply by loading it as one data type, and then saving it as
another.
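   For instance, a function that accepts a single precision value and
returns it as double precision needs nothing more than a load and a store.
This sketch follows the pattern of Square; the name ToDouble and its
declaration are hypothetical, and the file would again be assembled with
the /e switch.

```asm
;Hypothetical: DECLARE FUNCTION ToDouble#(Variable!)
.Model Medium, Basic
.Code
.8087                   ;allow 8087 instructions

ToDouble Proc, InValue:Word, OutValue:Word

  Mov  BX,InValue       ;get the address for InValue
  FLd  DWord Ptr [BX]   ;load it as a single precision value

  Mov  BX,OutValue      ;get the address for OutValue
  FStp QWord Ptr [BX]   ;store it as a double precision value
  FWait                 ;wait for the 8087 to finish

  Mov  AX,BX            ;return DX:AX holding the full
  Mov  DX,DS            ;  address of the output value
  Ret                   ;return to BASIC

ToDouble Endp
End
```
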
   The next instruction, FMul (Floating point Multiply), multiplies the
value currently on the 8087 stack by the value at the same address.  Since
the original value is still present in memory, there's no need to make a
new copy.  Next, the destination address is placed into BX, and the result
now on the 8087 stack is stored there.  The trailing letter p in the FStp
instruction specifies that the value is to be popped from the coprocessor
stack once it has been stored.
   A complete discussion of 8087 instructions and how the coprocessor stack
operates goes beyond what I can hope to cover here.  When in doubt about
what instruction is needed, I suggest that you code a similar sample in
BASIC, and then examine the code BASIC generates using CodeView.  There are
also several books that focus on writing floating point instructions in
assembly language.
   The last 8087 instruction is FWait, and it tells the 8088 to wait until
the coprocessor has finished, before continuing.  Because an 8087 is a true
coprocessor, it operates independently of the main 8088 CPU.  Once a value
is loaded and the 8087 is instructed to perform an operation, control
returns immediately to the program that issued the instruction, and the
8087 continues to process the numbers in the background.  If Square exited
immediately and BASIC read the returned value, there's a good chance that
the 8087 would not have finished and the value would not yet be stored!
In that case, whatever happened to be in memory at that time would be the
value that BASIC uses, which is obviously incorrect.
   Experienced 8087 programmers know how long the various coprocessor
instructions take to complete, and with careful planning the number of
FWait commands can be kept to a minimum.  However, the code that BASIC
generates always finishes with an FWait.  Of course, there is no need to
wait when the emulator is in use.  In fact, an FWait is patched by BASIC
to do nothing (Mov AX,AX), rather than waste time invoking an empty
interrupt handler repeatedly.
   As shown, Square can be added to a Quick Library for use with either
QuickBASIC or BASIC PDS.  Unfortunately, the information LINK needs to
patch 8087 instructions is available only with BASIC PDS.  Therefore, the
following file is included in the libraries on the accompanying disk, to
supply the external data that LINK requires.


;FIXUPS.ASM, deciphered by Paul Passarelli

  FIARQQ  Equ 0FE32h
  FJARQQ  Equ 04000h
  FICRQQ  Equ 00E32h
  FJCRQQ  Equ 0C000h
  FIDRQQ  Equ 05C32h
  FIERQQ  Equ 01632h
  FISRQQ  Equ 00632h
  FJSRQQ  Equ 08000h
  FIWRQQ  Equ 0A23Dh

  Public  FIARQQ
  Public  FJARQQ
  Public  FICRQQ
  Public  FJCRQQ
  Public  FIDRQQ
  Public  FIERQQ
  Public  FISRQQ
  Public  FJSRQQ
  Public  FIWRQQ
End


These values are added to the floating point instruction bytes during the
linking process, and the addition converts those statements into equivalent
BASIC floating point interrupt commands.  For example, the 8087 statement
Fld DWord Ptr [1234h] is represented in memory as the following series of
hexadecimal bytes:

   9B D9 06 34 12

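LINK adds the value FIDRQQ (5C32h) to the first two bytes of this command.
Because the low byte of a word is stored first, the bytes 9B D9 are treated
as the word D99Bh.  Adding 5C32h gives 135CDh; the carry out of sixteen
bits is lost, leaving 35CDh, so the result is: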

   CD 35 06 34 12

And when disassembled back to assembler mnemonics, the CD35h displays as
Int 35h.  The three bytes that follow are always left unchanged, and they
specify the type of operation--DWord Ptr on a memory location--and the
address of that location.


Floating Point Comparisons

At the core of any sorting or searching routine is an appropriate
comparison function.  Previous chapters showed how to compare string data,
and as you can imagine comparing floating point values is much more
complex.  But now that you know how to tap into BASIC's floating point
routines it is almost trivial to effect a floating point comparison.  The
routines that follow let you compare either single or double precision
values, by passing them as arguments.

;COMPAREF.ASM, compares floating point values

;WARNING: This file must be assembled using /e (emulator)

.Model Medium, Basic
  Extrn B$FCMP:Proc   ;BASIC's FP compare routine

.8087                 ;allow coprocessor instructions
.Code

CompareSP Proc, Var1:Word, Var2:Word

  Mov  BX,Var2        ;get the address of Var2
  Fld  DWord Ptr [BX] ;load it onto the 8087 stack
  Mov  BX,Var1        ;same for Var1
  Fld  DWord Ptr [BX]
  FWait               ;wait until the 8087 says it's okay
  Call B$FCMP         ;compare the values, (and pop both)

  Mov  AX,0           ;assume they're the same
  Je   Exit           ;we were right
  Mov  AL,1           ;assume Var1 is greater
  Ja   Exit           ;we were right
  Dec  AX             ;Var1 must be less than Var2
  Dec  AX             ;decrement AX to -1

Exit:
  Ret                 ;return to BASIC

CompareSP Endp



CompareDP Proc, Var1:Word, Var2:Word

  Mov  BX,Var2        ;as above
  Fld  QWord Ptr [BX]
  Mov  BX,Var1
  Fld  QWord Ptr [BX]
  FWait
  Call B$FCMP

  Mov  AX,0
  Je   Exit
  Mov  AL,1
  Ja   Exit
  Dec  AX
  Dec  AX

Exit:
  Ret

CompareDP Endp
End

Like the Compare3 function shown in Chapter 8, CompareSP and CompareDP are
integer functions that return -1, 0, or 1 to indicate if the first value
is less than, equal to, or greater than the second.  Therefore, to use
these from BASIC you would invoke them like this:

   IF CompareSP%(Value1!, Value2!) = -1 THEN
     'the first value is smaller than the second
   END IF

And to test if the first is equal to or greater than the second you would
instead do this:

   IF CompareSP%(Value1!, Value2!) >= 0 THEN
     'the first value is equal or greater
   END IF

You can also use these functions from assembly language.  But if you do
this, I suggest a simple modification.  A comparison routine meant to be
called from another assembler routine would not generally return the result
in the registers.  Rather, it would leave the flags set appropriately for
a subsequent Ja or Jne branch.
   Fortunately, BASIC's B$FCMP routine already does this.  Therefore, you
can make a copy of the COMPAREF.ASM source file, and delete the six lines
between the call to B$FCMP and the Ret instruction.  You can also remove
the Exit: label if you like, although its presence causes no harm.  Of
course, the code itself is so simple that the best solution may be to
simply duplicate the same instructions inline in your routine.
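   As a sketch of the inline approach, the fragment below branches directly
on the flags that B$FCMP leaves behind.  Var1 and Var2 are assumed to hold
the addresses of two single precision values, as they do in CompareSP, and
the labels Greater and Same are hypothetical targets elsewhere in your
routine.

```asm
  Mov  BX,Var2          ;load the second value first,
  Fld  DWord Ptr [BX]   ;  just as CompareSP does
  Mov  BX,Var1          ;then load the first value
  Fld  DWord Ptr [BX]
  FWait                 ;wait until the 8087 says it's okay
  Call B$FCMP           ;compare the values, (and pop both)
  Ja   Greater          ;the first value is larger
  Je   Same             ;the two values are equal
                        ;fall through if the first is smaller
```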


EXPLOITING MASM'S FEATURES
==========================

Each example I have shown so far introduced another useful MASM feature. 
For example, you learned how MASM lets you establish data memory with an
initial value, so you don't have to assign it explicitly.  But there are
several other features you should know about as well.  One is conditional
assembly.


CONDITIONAL ASSEMBLY

With conditional assembly you can specify that only certain portions of a
file are to be assembled.  This makes it easier to maintain two different
versions of a routine, for example one for near strings and one for far
strings.  If you had to create two separate copies of the source file, any
improvements or bug fixes that you add would have to be done twice.
   There are two ways that a section of code can be optionally included or
excluded.  One is to define a constant at the beginning of the source file,
and then test that constant using a form of IF and ELSE test.  Like BASIC,
MASM lets you define constant values using meaningful names.  The problem
with this method--albeit a minor one--is that you must alter the code prior
to assembling each version.  The example that follows shows how this kind
of conditional assembly is employed.

   MyConst = 1
    .
    .
   IF MyConst
          ;do whatever you want here
   ELSE   ;the ELSE is optional
          ;do whatever else you want here
   ENDIF
    .
    .

The idea is that if you want the code that follows the IF test to be
assembled, you would use a non-zero value for MyConst.  If you wanted to
create an alternate version using the code within the optional ELSE block,
you would change the value to be zero.
   You can also use IFE (If Equal to zero) to test if a constant is zero. 
And this brings up another interesting MASM feature.  There are actually
two types of constants you can define.  The constant MyConst shown above
is called a *redefinable* constant, because you can actually change its
value during the course of a program.  The other type of constant is
defined using the Equ (Equate) directive, and may not be changed:

   YourConst Equ 100

Redefinable constants are often used in repeating macros, and macros are
discussed later in this section.
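   The difference between the two kinds of constants can be sketched as
follows.  The names Count, Total, and Limit are made up for this example.

```asm
.Data
  Count = 0             ;a redefinable constant
  Count = Count + 1     ;legal, Count is now 1
  Count = Count + 1     ;and now it is 2
  Total DW Count        ;assembles the same as Total DW 2

  Limit Equ 100         ;an Equ constant is fixed
  ;Limit Equ 200        ;if enabled, MASM would reject this line
```
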
   The other way to tell MASM that it is to assemble just a portion of the
file is with IFDEF.  IFDEF (If Defined) tests if a constant has been
defined at all, as opposed to testing for a specific value.  The value
of this approach is that you can define a constant on the MASM command line
when you run it.  The first example below tells MASM to assemble the code
within the IFDEF block, and the second tells it not to.

   C:\ASM\> masm program /Dmyconst ;

   C:\ASM\> masm program ;

Here's the portion of the routine that is being assembled conditionally:

   IFDEF MyConst
     ;do something optional here
   ENDIF

Likewise, IFNDEF (If Not Defined) tests if a constant has not been
defined, for those times when reversing the logic is more sensible.  MASM
includes a great number of such conditional tests, and only by reading that
section of the MASM manual will you become familiar with those that are the
most useful.
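   To tie this back to near and far strings, here is one way the two
versions of TrueFalse$ might be kept in a single source file.  This is a
sketch rather than a tested listing: the constant name FarStrings is an
assumption, and it would be defined on the MASM command line only when
assembling the far strings version.

```asm
.Model Medium, Basic
IFDEF FarStrings
  Extrn MakeString:Proc        ;needed for far strings only
ENDIF

.Data
  Descriptor DW 0, 0           ;the output string descriptor
  True       DB "True"
  False      DB "False"

.Code
TrueFalse Proc Uses ES DS SI DI, Argument:Word

  Mov  CX,4             ;assume true
  Mov  AX,Offset True

  Mov  BX,Argument      ;get the address for Argument%
  Cmp  Word Ptr [BX],0  ;is it zero?
  Jne  @F               ;no, so we were right

  Inc  CX               ;yes, assign five characters
  Mov  AX,Offset False  ;and use the address of "False"

@@:
IFDEF FarStrings
  Mov  DX,DS                ;DX:AX = source text in DGROUP
  Mov  DI,Offset Descriptor ;DS:DI = destination descriptor
  Xor  BX,BX                ;assign to a descriptor
  Call MakeString           ;let MakeString do the work
  Mov  AX,DI                ;AX = address of output descriptor
ELSE
  Mov  Descriptor,CX        ;fill in the length
  Mov  Descriptor+2,AX      ;and the address of the text
  Mov  AX,Offset Descriptor ;AX = address of output descriptor
ENDIF
  Ret                   ;return to BASIC

TrueFalse Endp
End
```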


COMMENT BLOCKS

Another useful MASM feature that I personally would love to see added to
BASIC is multi-line comment blocks.  The Comment command accepts any single
character you choose as a delimiter, and considers everything thereafter
to be comments until the same character is encountered.  Many programmers
use a vertical bar, because it is not a common character:

   Comment |
   This program is intended to blah blah blah, and it works
   by loading AX with blah blah blah.
   |

Besides avoiding the need to place an explicit semicolon on each comment
line, this also makes it easy to remark out large sections of code while
you are debugging a routine.


QUOTED STRINGS

Yet another useful feature is MASM's willingness to use either single or
double quotes to indicate ASCII text and individual characters.  In BASIC,
if you want to specify a double quote you must use CHR$(34)--it simply is
not legal to use """, where the quote in the middle is the character being
defined.  [With the introduction of VB/DOS triple quotes may now be used
for this purpose.]  If you need to define a double quote simply surround
it with apostrophes like this:

   SomeData DB '"'
   Mov  AH, '"'

Or you can place a single quote within double quotes like this:

   Add DL, "'"

MASM can use either convention as needed, which is a feature I personally
like a lot.


LENGTH AND ADDRESS SELF-CALCULATION

Whenever MASM sees the dollar sign ($) operator it interprets that to mean
*here*, or the current address.  This can be used both for data and code,
though it is more common with data as the example below illustrates.

   .Data
     Descriptor DW MsgLen, Address
     Message    DB "This is a message."
     Address =  Offset Message
     MsgLen  =  $ - Address

The expression $ - Address tells the assembler to take the current data
address, and subtract from that the address where Message begins.  This is
a very powerful concept because it frees the programmer from many tedious
calculations.  In particular, if the string contents are changed at a later
time, the new length is recalculated by MASM automatically.
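   The same idea can count the entries in a table.  In this sketch the
names Table and NumItems are hypothetical.

```asm
.Data
  Table    DW 10, 20, 30, 40    ;a table of four words
  NumItems =  ($ - Table) / 2   ;bytes used so far, divided by two

.Code
  Mov  CX,NumItems              ;assembles the same as Mov CX,4
```

If more words are later added to Table, NumItems follows along without
any other change to the source.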


DEFINING DATA STRUCTURES

To assist you in manipulating data structures, MASM offers the Struc
directive.  This is much like BASIC's TYPE statement, whereby you define
the organization of a collection of related data items.  The example below
shows how to define a custom data structure using BASIC, followed by an
equivalent MASM Struc definition.


BASIC:

   TYPE MyType
     LastName  AS STRING * 15
     FirstName AS STRING * 12
     ZipCode   AS STRING * 5
     RecordPtr AS LONG
   END TYPE
   DIM MyVar AS MyType


MASM:

   MyStruc Struc
     LastName  DB 15 Dup (?)
     FirstName DB 12 Dup (?)
     ZipCode   DB  5 Dup (?)
     RecordPtr DD  ?
   MyStruc Ends
   MyVar DB Size MyStruc Dup (?)


Like BASIC, defining a structure merely establishes the number and type
of data items that will be stored; memory is not actually set aside until
you do that manually.  In BASIC, you must use DIM to establish the memory
that will hold the TYPE variable.  In assembly language you instead use DB
in conjunction with the Size operator, to set aside the appropriate number
of bytes.
   Each component of the Structure is defined using an identifying name and
a corresponding data type.  Then, whenever a structure member is referenced
in your assembler routine, MASM replaces it with a number that shows how
far into the structure that member is located.  MASM uses the same syntax
as BASIC, with a period between the variable name and the member name.
Here are a few examples:

   Mov  AL,[BX+MyVar.FirstName]  ;same as Mov AL,[BX+MyVar+15]
   Les  DI,[MyVar.RecordPtr]     ;loads ES:DI from RecordPtr
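
When a register holds the address of the structure, a member name can
also follow the register directly.  The fragment below is a sketch that
assumes the MyStruc and MyVar definitions shown above; the offsets in the
comments come from adding up the member sizes.

```asm
  Mov  BX,Offset MyVar             ;BX = address of the structure
  Mov  AL,[BX].ZipCode             ;first ZipCode byte, at [BX+27]
  Mov  DX,Word Ptr [BX].RecordPtr  ;low word of RecordPtr, at [BX+32]
```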


MINIMIZING DGROUP USAGE
=======================

In many cases you will store the variables your routines need in DGROUP
using the .Data directive.  As with static subprograms and functions in
BASIC, this data will not change between subroutine calls.  But this also
means that these variables are combined into the same 64k segment that is
shared with BASIC.  When there are many variables or many different
routines each with their own variables, this can significantly reduce the
amount of near memory available to BASIC.  There are two effective
solutions to this problem.


LOCAL VARIABLES

One way to reduce the DGROUP impact of many variables is to place some of
them onto the system stack.  MASM lets you do this automatically with its
Local directive, or you can do it manually by subtracting the requisite
number of bytes from SP.  Of course, there is only so much room on the
stack, so this approach is most useful when there are many routines and
each has less than 1K or so of data.  Stack variables are also useful when
programming for OS/2 or Windows.  These operating systems require that all
of your procedures be reentrant, so static variables cannot be used.
   The example below creates room for fifty words of local storage on the
stack, and then clears the variables to zero.

   Routine Proc Uses ES DI, Param1:Word, Param2:Word
     Sub  SP,100         ;50 words = 100 bytes
     Push SS             ;assign ES from SS
     Pop  ES
     Mov  DI,SP          ;point DI to the start of storage
     Xor  AX,AX          ;fill with zeros
     Mov  CX,50          ;clear fifty words
     Rep  Stosw          ;store AX CX times at ES:[DI]
      .                  ;the routine continues
      .
     Add  SP,100         ;restore SP to what it had been
     Ret                 ;return to BASIC
   Routine Endp

MASM can also do this automatically for you using Local like this:

   Routine Proc Uses ES DI, Param1:Word, Param2:Word
     Local Buffer [100]:Byte
     Lea  DI,Buffer      ;clear the stack variables here
      .                  ;the routine continues
      .
     Ret                 ;return to BASIC
   Routine Endp

As you can see, Local lets you refer to the start of the local stack data
area by name.  Notice how Lea is required here, because the address of
Buffer is expressed as an offset from BP.  That is, MASM translates the
Lea instruction to Lea DI,[BP-100].  You cannot use Mov DI,Offset Buffer
because Buffer's address (which is based on the current setting of the
stack pointer) is not known when the routine is assembled or linked.
   In this case only one local block is defined, so in a routine that
saves no registers with Uses you could also use Mov DI,SP to set DI to
point to the start of the data.  (MASM pushes the registers named by Uses
after it reserves the local storage, which leaves SP below the buffer.)  It
is not strictly necessary to clear the stack space before using it, but it
is important to understand that whatever junk happened to be in memory at
that time will still be there after using Local.
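   If you do want the local storage cleared, the fill loop from the manual
version can be combined with Local.  This is a sketch that simply merges
the two fragments shown above.

```asm
Routine Proc Uses ES DI, Param1:Word, Param2:Word
  Local Buffer[100]:Byte
  Push SS             ;assign ES from SS
  Pop  ES
  Lea  DI,Buffer      ;point DI to the start of storage
  Xor  AX,AX          ;fill with zeros
  Mov  CX,50          ;clear fifty words
  Rep  Stosw          ;store AX CX times at ES:[DI]
                      ;the routine continues here
  Ret                 ;MASM releases the local storage for you
Routine Endp
```
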
   It is also important to be aware of a number of bugs with the Local
directive.  I have found that limiting the use of Local to a single set of
data as shown here is safe with all MASM versions through 5.1.  Using
multiple Local directives defined with data structures can result in the
wrong part of the stack being written to when a structure member is
accessed by name.


STORING DATA IN THE CODE SEGMENT

Another time-honored technique for conserving DGROUP memory is to place
selected variables into the code segment.  In most cases storing data for
a routine in the code segment will make your programs slightly larger and
slower, because of the need for an added CS: segment override.  But when
large amounts of data must be accommodated, this can be very valuable
indeed.  One advantage to using the code segment is that you can establish
initial values for the data, which is not possible when using the stack.
   As an example of this technique, I have written a string function called
Message$ that stores a series of messages in the code segment.  In this
case only a single CS: segment override is needed, so the impact of using
the code segment for data is insignificant.  Message$ is designed to be
declared and invoked as follows:

   DECLARE FUNCTION Message$(BYVAL MsgNumber%)
   Result$ = Message$(AnyInt%)

Message$ is table driven, which makes it simple to modify the routine to
change or add messages without having to make any changes to the function's
structure.  As shown here, Message$ is designed to return the name of a
weekday, given a value between one and seven.  You can easily modify it to
return other strings of nearly any length.

.Model Medium, Basic
  Extrn B$ASSN:Proc         ;BASIC's assignment routine

.Data
  Descriptor DD 0           ;the output string descriptor
  Null$      DD 0           ;use this to return a null
                            ;  (needed for BASIC PDS only,
.Code                       ;  but okay with QuickBASIC)

Message Proc Uses SI, MsgNumber:Word

  Mov  SI,Offset Messages   ;point to start of messages
  Xor  AX,AX                ;assume an invalid value

  Mov  CX,MsgNumber         ;load the message number
  Cmp  CX,NumMsg            ;does this message exist?
  Ja   Null                 ;no, return a null string
  Jcxz Null                 ;ditto if they pass a zero

Do:                         ;walk through the messages
  Lods Word Ptr CS:0        ;load and skip over this message's length
  Dec  CX                   ;show that we read another
  Jz   Done                 ;this is the one we want

  Add  SI,AX                ;skip over the message text
  Jmp  Short Do             ;continue until we're there

Done:
  Or   AX,AX                ;are we returning a null?
  Jz   Null                 ;yes, handle that differently
  Push CS                   ;no, pass the source segment

Done2:
  Push SI                   ;and the source address
  Push AX                   ;and the source length

  Push DS                   ;pass the destination segment
  Mov  AX,Offset Descriptor ;and the destination address
  Push AX
  Xor  AX,AX                ;0 means assign a descriptor
  Push AX                   ;pass that as well

  Call B$ASSN               ;let B$ASSN do the dirty work
  Mov  AX,Offset Descriptor ;show where the output is
  Ret                       ;return to BASIC

Null:
  Push DS                   ;pass the address of Null$
  Mov  SI,Offset Null$
  Jmp  Short Done2

Message Endp


;----- DefMsg macro that defines messages
DefMsg Macro Message
  LOCAL MsgStart, MsgEnd    ;;local address labels
  NumMsg = NumMsg + 1       ;;show we made another one
  IFB <Message>             ;;if no text is defined
    DW 0                    ;;just create an empty zero
  ELSE                      ;;else create the message
    DW MsgEnd - MsgStart    ;;first write the length
    MsgStart:               ;;identify the starting address
      DB Message            ;;define the message text
    MsgEnd Label Byte       ;;this marks the end
  ENDIF
Endm


Messages Label Byte         ;the messages begin here
NumMsg = 0                  ;tracks number of messages
                            ;DO NOT MOVE this constant
DefMsg "Sunday"
DefMsg "Monday"
DefMsg "Tuesday"
DefMsg "Wednesday"
DefMsg "Thursday"
DefMsg "Friday"
DefMsg "Saturday"
End

After declaring BASIC's B$ASSN routine as being external, Message$ defines
two string descriptors in the Data segment.  The first is used for the
function output when returning a normal message, and the second is used
only when returning a null string.  In truth, the separate null descriptor
and the few added steps to detect the special case of a null output string
are needed only with BASIC PDS far strings.  And this brings up an
important point.
   It is impossible to write one assembly language subroutine that can work
with both QuickBASIC and BASIC PDS far strings using the normal, documented
methods.  To create a string function for use with QuickBASIC and PDS near
strings, you define and fill in a string descriptor in DGROUP, and place
its address in AX before returning to BASIC.  And to return a far string
as a function for PDS requires calling the internal STRINGASSIGN routine
that Microsoft provides with PDS.  STRINGASSIGN works with both near and
far strings in PDS, but is not available in QuickBASIC.
   The trick is to use the *undocumented* name B$ASSN, which is really the
same thing as STRINGASSIGN.  The big difference, though, is that B$ASSN is
available in all versions of BASIC 4.0 and later.  When near strings are
used the B$ASSN routine is extracted from the near strings library.  When
linking with far strings a different version is used, extracted by LINK
from the far strings library.  This is a powerful concept to be sure, and
one we will use again for other examples later on in this chapter.
   Message$ begins by loading SI with the starting address of a table of
messages.  These messages are located at the end of the source file in the
code segment, and each is preceded with the length of the text.  Although
it may not be obvious from looking at the source listing, the message data
is actually structured like this:

   DW 6
   DB "Sunday"
   DW 6
   DB "Monday"
    .
    .

Next, AX is cleared to zero just in case the incoming string number is
illegal.  Later in the program AX holds the length of the output string;
clearing it here simply makes the program's logic more direct.
   CX is then loaded with the message number the caller asked for.  If CX
is either higher than the available number of messages or zero, the program
jumps to the code that returns a null string.  Otherwise, a small loop is
entered that walks through each message, decrementing CX as it goes.  When
CX reaches zero, SI is pointing at the correct message and AX is holding
its length.  Otherwise, the current length is added to SI, thus skipping
over that data.
   Notice the unusual form of the Lodsw statement, to allow it to work with
a CS: override.  MASM has a number of quirks that are less than intuitive,
and this is but one of them.  Normally you would use either Lodsb or Lodsw,
to indicate loading either a byte into AL or a word into AX.  But when you
use a segment override MASM requires omitting the "b" or "w" Lods suffix,
and you must state Byte Ptr or Word Ptr explicitly.  Then, a dummy argument
must be placed after the override colon.
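   A minimal sketch of the two forms follows.  The [SI] operand is only one
spelling that MASM accepts and is used here as an assumption; whatever
follows the colon merely carries the override and the operand size, because
Lods always reads through SI.

   Lodsw                    ;normal form, loads AX from DS:[SI]
   Lods Word Ptr CS:[SI]    ;override form, loads AX from CS:[SI]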


MASM MACROS

The last new feature this listing introduces is the use of macros.  The
most basic use of MASM macros is to define a block of code once, and then
repeat it multiple times with a single statement.  This is not unlike
keyboard macro programs such as Borland's SuperKey, that let you assign a
string of text to a single key.  For example, you could press Alt-S and
SuperKey will type "Very truly yours", five Enter keys, and then your name.
   MASM macros also offer many other interesting and useful capabilities,
including the ability to accept arguments.  [I should mention that the main
point of the DefMsg macro is to make this function easy to modify, so you
can create other, similar string functions from this same routine.]  Before
attempting to explain the DefMsg (Define Message) macro I designed for use
with Message$, let's consider some macro basics.
   Say, for example, you find that a particular routine needs to push the
same five registers many times during the course of a procedure.  To
simplify this task you could define a macro--perhaps named PushRegs--that
performs the code sequence for you.  Such a macro definition would look
like this:

   PushRegs Macro
     Push AX
     Push BX
     Push SI
     Push DS
     Push ES
   Endm

Now, each time you want to execute this series of instructions you would
simply use the command PushRegs.  Please understand that a macro is not the
same as a called subroutine.  The assembler still places each Push command
in sequence into your source code each time the macro is invoked.  But a
simple macro like this can reduce the amount of typing you must do, and
minimize errors such as pushing registers in the wrong order.  And in some
cases macros also make your code easier to read.
   As I mentioned, a MASM macro can accept arguments, and it can even be
designed to accept a varying number of them.  If you always push three
registers, but which three may vary, you would define PushRegs like this:

   PushRegs Macro Reg1, Reg2, Reg3
     Push Reg1
     Push Reg2
     Push Reg3
   Endm

Then to push AX, SI, and DI you would invoke PushRegs as follows:

   PushRegs AX, SI, DI

Of course, a corresponding PopRegs macro would be defined similarly.  Once
a macro has been defined you can pass any legal argument to it.  For
example, you could also use this:

   PushRegs AX, Word Ptr [BP-20], IntVar

Here, you are pushing AX, the word 20 bytes below where BP points to on
the stack, and the integer variable named IntVar.
   A useful enhancement to this macro would let you pass it a varying
number of parameters.  The PushM macro that follows accepts any number of
arguments (up to eight), and pushes each in sequence.

   PushM Macro A,B,C,D,E,F,G,H     ;;add more place-holders to suit
     IRP CurArg, <A,B,C,D,E,F,G,H> ;;repeat for each argument
       IFNB <CurArg>               ;;if this arg is not blank
         Push CurArg               ;;push it
       ENDIF
     Endm                          ;;end of repeat block
   Endm                            ;;end of this macro

From this you can create a complementary PopM macro by changing the name,
and also changing the Push instruction to Pop.
   The IRP command works much like a FOR/NEXT loop in BASIC, and tells MASM
to repeat the following statements for each argument that was given.  IFNB
(If Not Blank) then tests each argument to see if it was in fact present
in the incoming list of parameters.  In this case, CurArg assumes the name
of the argument, and the Push instruction is expanded to specify that name.
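   A sketch of that complementary macro follows, built as described by
swapping Pop for Push.  Because it pops its arguments in the order given,
this version assumes you list the registers in the reverse of the order
used with PushM.

   PopM Macro A,B,C,D,E,F,G,H      ;;same place-holders as PushM
     IRP CurArg, <A,B,C,D,E,F,G,H> ;;repeat for each argument
       IFNB <CurArg>               ;;if this arg is not blank
         Pop CurArg                ;;pop it
       ENDIF
     Endm                          ;;end of repeat block
   Endm                            ;;end of this macro

   PushM AX, SI, DI                ;save three registers
    .
    .
   PopM  DI, SI, AX                ;restore them, last one first

The reversed list matters because the stack returns values in the opposite
order from which they were saved.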
   There is no disputing that the syntax of a MASM macro is confusing at
best.  Having to enclose some arguments in angle brackets but not others
requires frequent visits to the MASM manual.  Further, a MASM macro is
virtually impossible to debug.  If you write a macro incorrectly or create
a syntax error, MASM reports an error at the line where the macro was
invoked, rather than at the line containing the error in the macro.  It is
not uncommon to receive a number of errors all pointing to the same source
line, with no indication whatsoever where the error really is.
   Now consider how the DefMsg macro operates.  DefMsg begins by defining
a single incoming parameter named Message.  Two local labels--MsgStart and
MsgEnd--are defined, and these are needed so MASM can calculate the length
of the messages.  Although labels within a macro do not have to be declared
as local, you would get an error if the macro were used more than once. 
Like BASIC, the assembler requires that each label have a unique name.  By
using local labels MASM generates a new, unique internal name for each
macro invocation, instead of the actual label name given.
   The next statement increments a MASM variable named NumMsg.  To avoid
an error caused by calling Message$ with an invalid message number, the
procedure compares the number you pass to the number of messages defined.
This test occurs in the fourth line of the procedure, at the Cmp CX,NumMsg
statement.  NumMsg is a constant, except it may be redefined within the
routine.  (When a constant is assigned using the Equ directive, its value
may not be changed by either your source code or by a macro.)  But when a
variable is defined using an equals sign (=), MASM allows it to be altered
as it assembles your program.  Understand that the resulting number is
added to your program as a constant.  However, its value can be changed
during the course of assembly.  Therefore, each time DefMsg is invoked, it
increments NumMsg.  MASM places the final value into the Cmp instruction,
as if you had defined it using a fixed known value.
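   The difference between the two kinds of constants can be sketched in a
few lines.  The names MaxLen and Count are hypothetical and are not part
of Message$.

   MaxLen Equ 80            ;fixed, may not be given a new value
   Count = 0                ;redefinable, starts at zero
   Count = Count + 1        ;legal, Count is now 1
   Count = Count + 1        ;and now 2
    .
   Cmp  CX,Count            ;assembles exactly like Cmp CX,2

Message$ relies on the same behavior, with DefMsg doing the incrementing.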
   The IFB (If Blank) test checks to see if DefMsg was given a parameter
when it was invoked.  In most cases you will probably want to define a
series of consecutive messages.  As it is used here, seven different day
names are returned in sequence.  But there may be times when you want to
leave a particular message number blank.  For example, you could create a
series of messages that correspond to BASIC's error numbers.  BASIC file
error numbers range from 50 through 76, but there are no messages numbered
60, 65, or 66.  You could therefore leave those blank, and invoke a
modified copy of Message$ like this:

   PRINT DOSMessage$(ERR - 49)     'error 50 is message number 1

When DefMsg is used with no argument, it merely creates a zero word at
that point in the code segment.  Otherwise, the length of the message is
stored, followed by the message text.  The statement DW MsgEnd - MsgStart
is replaced with the difference between the addresses, which MASM
calculates for you.  This is similar to the earlier example that showed how
a dollar sign ($) can simplify defining strings that may change.
   The last macro I will describe here is Rept, which means "Repeat the
following statements a given number of times".  In the simplest sense, Rept
could be used to generate a series of the same instructions:

   Rept 100
     Xor  AX,AX
     Push AX
     Call SomeProc
   Endm

A Rept macro is not invoked by name; rather, it is added inline to a
program (or included within a macro that is called by name).  In most cases
you would use a coding loop to repeat a block of code, since a Rept macro
actually generates the same code repeatedly in the program.  But there are
situations where timing is very critical, and a loop is always somewhat
slower than a sequence of inline instructions.
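   For comparison, the same work done with an ordinary loop might look
like the sketch below.  It assumes that SomeProc removes its one argument
from the stack and that it may change CX; the label Again is hypothetical.

   Mov  CX,100              ;repeat 100 times
   Again:
   Push CX                  ;save the counter across the call
   Xor  AX,AX
   Push AX
   Call SomeProc
   Pop  CX                  ;restore the counter
   Loop Again               ;decrement CX, jump if not yet zero

The loop occupies a handful of bytes instead of one hundred copies of the
sequence, at the cost of the extra Push, Pop, and Loop on every pass.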
   Another good use for Rept is in conjunction with redefinable equates,
such as this example which defines the letters of the alphabet:

   Alphabet:
   Char = 0
   Rept 26                ;;do this 26 times
     DB "A" + Char        ;;define ASC("A") + Char
     Char = Char + 1      ;;increment Char
   Endm

Although the MASM manual states that you must use double semicolons for
remarks within a macro as shown here, I have used a single semicolon
without problems.
   There are other macro commands and features I will not describe here,
because I have not found them to be particularly useful.  However, macros
can be recursive, nested within one another, and even redefined on the
fly.  I urge you to refer to the documentation that Microsoft provides for
more information on those advanced features.


SEGMENT NAMING
==============

Aside from the short PrtSc example shown earlier in this chapter, we have
relied upon MASM's simplified segmentation directives to spare us from the
nuisance of defining and naming segments.  Indeed, when writing routines
that will be added to BASIC it is rarely necessary to do this manually, so
why bother?
   One place where naming segments explicitly is useful is when you have
many internal procedures that are never called from BASIC directly.  If,
for clarity and organization reasons, you decide to store those routines
in different files, you still may want to access the routines using near
calls.  Since a near call is two bytes shorter than a far call and also
operates slightly faster, this can make a difference when there are many
Call commands within the routines.
   As LINK pulls all of the various pieces of your program together from
separate object and library files, it reads the segment names and combines
those with the same name.  Thus, a routine in one source file can call a
routine in a different file, and LINK will place both routines into the
same segment if they use the same segment name.  This is of course needed
to ensure that the called routine is reachable by the caller (within 64K).
   All of the standard segment names that Microsoft recommends are listed
in the MASM manual, along with instructions for creating your own names. 
Therefore, I won't belabor that here.
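   As a rough sketch of the idea, here are two hypothetical source files
that share one segment.  The names MyCode, MainProc, and Helper are
invented for illustration, and the Word and Public attributes are one
reasonable choice rather than a requirement.

;FILE1.ASM (hypothetical)
MyCode Segment Word Public 'Code'
  Assume CS:MyCode
  Extrn Helper:Near       ;declared inside the segment it lives in
  Public MainProc

MainProc Proc Far         ;BASIC reaches this one with a far call
  Call Helper             ;a three-byte near call
  Ret
MainProc Endp
MyCode Ends
End

;FILE2.ASM (hypothetical)
MyCode Segment Word Public 'Code'
  Assume CS:MyCode
  Public Helper

Helper Proc Near          ;reachable only from within MyCode
  Ret
Helper Endp
MyCode Ends
End

Because both files give the segment the same name and the Public combine
type, LINK joins them into one segment and can resolve the near call.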


ACCESSING BASIC INTERNALS
=========================

In preceding sections you learned that it is possible--even desirable--
to call BASIC's internal routines directly.  Besides those that have
already been described, there are several other useful routines that can
be accessed from assembly language.  One of these is B_ONEXIT, which lets
you tap into BASIC's termination procedure.
   When a BASIC program ends by running out of statements, or by using END,
STOP, or SYSTEM, BASIC makes a call to a central routine that in turn tells
DOS to end the program.  If a fatal error occurs and there is no ON ERROR
handler, BASIC also calls a routine that prints an error message.  B_ONEXIT
lets you tell BASIC the segment and address of a routine you want called
as part of the termination process.  B_ONEXIT is supported only in
QuickBASIC version 4.5 and BASIC PDS.
   One reason you might want to use B_ONEXIT is to ensure that interrupts
taken over by your assembler routine are restored properly.  Taking over
interrupts will be described later in the section "Handling Interrupts." 
Here's a program fragment showing how B_ONEXIT is set up and called:


Extrn B_ONEXIT:Proc     ;declare B_ONEXIT as external
Push CS                 ;pass your code segment
Mov  AX,Offset TermProc ;and the address of the routine
Push AX                 ;  that is to be called
Call B_ONEXIT           ;register it with B_ONEXIT
  .
  .

TermProc Proc           ;this is the routine to be called
  .                     ;do whatever you need to here
  .
  Ret                   ;don't forget to return!
TermProc Endp


BASIC's INTERNAL DATA

There are two internal variables BASIC maintains that you will find useful. 
One is the current DEF SEG setting, and it is stored in the integer
variable named B$SEG.  The other is the current color value that is used
by PRINT and CLS.  The foreground and background colors are stored combined
in a single word named B$FBColors.  These are useful because you may want
to change and then restore them from inside a BASIC
subprogram.  Much of the benefit of reusable programming is lost if you
cannot put things back to the way they were originally.
   For example, if you have written a BASIC routine that prints an error
message in bright red at the bottom of the screen, you will need to use a
subsequent COLOR command to put the color back to what it had been.  But
what color do you use?  The same holds true for a routine that changes the
current DEF SEG setting, perhaps before loading or saving a file using
BLOAD or BSAVE.  If you cannot return that to its original value, extra
work is needed in the main program each time the routine is used.
   Access to B$SEG requires a single assembler instruction, as shown in the
complete GetSeg function that follows.  Declare and use GetSeg like this:

   DECLARE FUNCTION GetSeg%()
   SavedSeg = GetSeg%
    .
    .
   DEF SEG = SavedSeg


;GETSEG.ASM
.Model Medium, Basic
.Data
  Extrn B$Seg:Word

.Code
GetSeg Proc

  Mov  AX,B$Seg   ;load the value from B$Seg
  Ret             ;return with the function output in AX

GetSeg Endp
End


Because BASIC combines its colors into a single word, a few extra steps
are needed to separate them.  Call GetColor like this:

   CALL GetColor(FG%, BG%)

FG% and BG% are returned to you holding the current foreground and
background color values.  Here's how GetColor works:


;GETCOLOR.ASM
.Model Medium, Basic
.Data
  Extrn B$FBColors:Word

.Code

GetColor Proc, FG:Word, BG:Word

  Mov  DX,B$FBColors    ;load the combined colors
  Mov  AL,DL            ;copy the foreground portion
  Cbw                   ;convert it to a full word
  Mov  BX,FG            ;get the address for FG%
  Mov  [BX],AX          ;assign FG%
  Mov  AL,DH            ;load the background portion
  Mov  BX,BG            ;get the address for BG%
  Mov  [BX],AX          ;assign BG%
  Ret                   ;return to BASIC

GetColor Endp
End


One unfortunate problem is that GetColor cannot be used in the editing
environment.  When BASIC compiles a PEEK or POKE statement, it generates
inline code that loads ES with the segment from B$SEG, and then reads or
writes the data at the specified address.  Therefore, the current segment
must be available to BASIC routines that use PEEK or POKE in a Quick
Library.  But the color values are accessed only by routines in BASIC's
runtime library, so the information is not made available to procedures in
a Quick Library.  Because of this issue, the GetColor routine is provided
on the accompanying disk only in the BASIC.LIB and BASIC7.LIB linking
libraries.
   There are several other internal data items you may want to know about,
and one that I have found useful is called __osversion.  This byte holds
the major DOS version number; for example, if DOS 3.x is running then
__osversion will hold the value 3.  Even though it is trivial to query DOS
for the number, why bother when you can get it with a single Mov?
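   A sketch of a function built on this follows.  It is modeled on GetSeg
and assumes that __osversion can be declared external in the same way as
B$Seg; the file and function names are hypothetical.

;GETDOSV.ASM (hypothetical)
.Model Medium, Basic
.Data
  Extrn __osversion:Byte

.Code
GetDOSVer Proc

  Mov  AL,__osversion  ;load the major version with a single Mov
  Xor  AH,AH           ;clear AH so AX holds a valid integer
  Ret                  ;return with the function output in AX

GetDOSVer Endp
End

It would be declared in BASIC as an integer function, just like GetSeg%.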


BASIC's INTERNAL ROUTINES

Besides the procedures and internal data I have described previously, there
are many others you will no doubt find useful.  You can, for example, call
SETMEM prior to claiming memory from DOS.  And although the B$ASSN routine
can assign any mix of conventional and fixed-length strings, a
simplified version is also present to assign to and from conventional
strings only.
   As you have seen, the beauty of using BASIC's own routines is that
identical code can be used for both near and far strings.  In either case,
the string descriptors are known to reside in DGROUP, and the internal
routines are designed to operate on those descriptors.  You don't even have
to know which of the string libraries (near or far) is being used.
   There are also several math routines that can be accessed directly,
including those that multiply, divide, and compare long integers.  Even if
you know how to do that, it's always easier to call BASIC's routines.  This
results in less code as well.  And if you need to read the current cursor
position, you can access CSRLIN and POS(0) directly.  In some cases, you
can't read that information from the BIOS, so calling BASIC is the only
reliable way to get it.
   The following section documents the BASIC internal routines that I have
found useful when called from assembly language.  I have purposely omitted
routines that handle BASIC commands such as PRINT, INKEY$, GET, and PUT. 
Even though several of these were described throughout the course of this
book, they have little relevance within a called assembler routine.
   BASIC's internal services that follow are listed in alphabetical order,
based on their call names.  Be sure to declare them as external procedures
in your routine's source code.


B$CPI4: Compare Two Long Integers

B$CPI4 expects two long integer arguments to be placed onto the stack by
value, and it returns the result of its comparison in the Flags register. 
For example, to see if Var1 is greater than Var2 you'd use code like this:

   Push Word Ptr [Var1+2]   ;first push Var1's high word
   Push Word Ptr [Var1]     ;and then its low word
   Push Word Ptr [Var2+2]   ;next do the same for Var2
   Push Word Ptr [Var2]
   Call B$CPI4              ;compare them
   Jg   Var1Greater         ;Var1 is indeed greater

Remember that long integers are compared by BASIC on a signed basis, so
you should use Jg or Jl rather than Ja or Jb.  The letters CPI4 stand for
Compare Integer 4 bytes.


B$CSRL: CSRLIN Function

B$CSRL is called with no arguments, and it returns BASIC's current row in
AX as follows:

   Call B$CSRL
    .                       ;do what you want with AX


B$DVI4: Divide Two Long Integers

Like B$CPI4, B$DVI4 (Divide Integer 4 bytes) expects the incoming integer
arguments to be passed by value on the stack.  The result is then returned
in DX:AX as a long integer:

   Push Word Ptr [Var2+2]   ;always push the high word first
   Push Word Ptr [Var2]     ;then the low word
   Push Word Ptr [Var1+2]   ;ditto for Var1
   Push Word Ptr [Var1]
   Call B$DVI4              ;divide them
    .                       ;now DX:AX holds Var1 \ Var2

Notice that with B$DVI4, the divisor is pushed first onto the stack,
followed by the dividend.


B$FPOS: POS(0) Function

Even though the argument passed to BASIC's POS(0) is ignored, it is still
expected mainly for historical reasons.  Therefore, you must push
something--anything--onto the stack before calling B$FPOS:

   Push AX
   Call B$FPOS
    .                       ;now AX holds the column


As with all of BASIC's functions that return an integer, B$FPOS returns
the current column in AX.  The leading F in FPOS stands for Function.
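   Combined with B$CSRL, this lets a routine capture the full cursor
position.  In this sketch Row and Column are hypothetical word variables
defined in the routine's .Data segment.

   Extrn B$CSRL:Proc, B$FPOS:Proc
    .
   Call B$CSRL              ;AX now holds the current row
   Mov  Row,AX              ;save it
   Push AX                  ;any value serves as the dummy argument
   Call B$FPOS              ;AX now holds the current column
   Mov  Column,AX           ;save that too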


B$FRI2: FRE() Function

B$FRI2 (Free Integer 2 bytes) requires an incoming integer argument by
value on the stack, and for safety you should use this for the -1 and -2
variations only.
   Using -1 reports the total amount of memory that is available to BASIC,
so you might use this before calling SETMEM to release memory for your own
uses.  Although B$FRI2 uses an integer for an argument, it returns a long
integer in DX:AX.  You can also use an argument of -2 to see how much stack
space is available:

   Mov  AX,-2
   Push AX
   Call B$FRI2
   .              ;now DX:AX holds the available stack space


B$RDIM: REDIM Statement

In most cases you will probably not find the ability to call REDIM directly
very valuable.  One notable exception is explained later in the section
entitled "Reading the Array Descriptor," where I show how to size and then
load a string array with all of the files that match a given search
specification.
   B$RDIM is fairly complicated to set up and call, because it accepts a
varying number of parameters.  This is needed because BASIC accepts a
variable number of dimensions, and the same routine is used for all cases. 
The following example shows how to prepare and call this routine when
resizing a one-dimensional array.

   Mov  AX,LBound           ;first pass the lower bound value
   Push AX
   Mov  AX,UBound           ;then pass the upper bound
   Push AX
   Mov  AX,ElementLength    ;next the length of each element
   Push AX
   Mov  AX,Features         ;see the accompanying text for
   Push AX                  ;  information on these two items
   Mov  AX,Offset ArrayDescriptor
   Push AX
   Call B$RDIM              ;call REDIM to do it

Chapter 2 described the array descriptor in detail, including the Features
word.  However, you must not use REDIM to create a new array where none
existed before.  Instead, you will read the current features from the
existing array descriptor, and pass the same values on again to B$RDIM. 
This will be shown in context momentarily.


B$STDL: String Delete

You can call B$STDL to delete a string or string array element, and it
requires less code than assigning the string from another, null string. 
The single argument is the address of a string descriptor:

   Mov  AX,Offset Descriptor
   Push AX
   Call B$STDL


B$SETM: SETMEM Function

B$SETM expects a long integer argument by value on the stack; if the value
is negative then that much memory is released back to DOS, and thus taken
from your BASIC program.  However, you should call B$SETM again later with
a positive value when you are finished, so the BASIC program can reclaim
that memory.  Since SETMEM is a function, B$SETM also returns the amount
of memory currently available in the DX:AX register pair.
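   A minimal sketch of releasing and later reclaiming 16K follows.  It
assumes the long integer is pushed high word first, as with B$CPI4 and
B$DVI4, and the amount is arbitrary.

   Mov  AX,-1               ;high word of the long integer -16384
   Push AX
   Mov  AX,-16384           ;low word (C000h)
   Push AX
   Call B$SETM              ;release 16K, DX:AX = memory now available
    .
    .
   Xor  AX,AX               ;high word of the long integer +16384
   Push AX
   Mov  AX,16384            ;low word (4000h)
   Push AX
   Call B$SETM              ;give the memory back to BASIC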


B$SASS: String Assign

Where B$ASSN is capable of assigning any mix of conventional and fixed-
length strings, B$SASS works with conventional strings only.  However, it
requires only two parameters instead of six:

   Mov  AX,Offset Source$
   Push AX
   Mov  AX,Offset Destination$
   Push AX
   Call B$SASS

Note that if the destination string is not null, its current contents are
released after assigning it from Source$.  This is the normal way that
strings are assigned, and B$ASSN also works like this.


Finding Other Routines

The routines just described are those that I personally have found to be
useful.  Discovering other routine names and how they are called is in
fact quite simple.  If you wanted to access, say, COMMAND$, you would write
a one-line BASIC program, and then examine the code that is generated using
Microsoft CodeView.  CodeView lets you see which and how many parameters
are being passed as well as the routine name being called, making
exploration both easy and fun.
   BASIC string functions such as COMMAND$ and ENVIRON$ return the DGROUP
address of the result string descriptor in AX, just like an assembly
language function you would write.  If you do call a built-in BASIC
function, be sure to also pass its output descriptor to B$STDL (String
Delete) when you are done with it.  Otherwise, the string space it uses
[and the temporary output descriptor] will never be released.


READING THE ARRAY DESCRIPTOR

Chapter 2 described the BASIC array descriptor in detail, and discussed
each of the components it contains.  Understanding how an array descriptor
works opens many opportunities to assembly language programmers, because
it lets you write routines that accept an array passed with empty
parentheses.  This was shown in the Sort routine introduced in Chapter 8,
although the techniques used there were not detailed.
   As an example of the possibilities direct access to an array descriptor
offers, I will show a subroutine that accepts a file specification, and
returns a string array filled with the names of all matching files. 
GetNames calls upon three internal BASIC routines: B$FLEN, B$RDIM, and
B$ASSN.  B$FLEN returns the length of a string, and is used here to know
how long the file specification is.  B$RDIM redimensions the passed string
array to the correct number of elements, based on the number of matching
file names that are found.  B$ASSN then assigns each name to its array
element.
   This next short BASIC program shows how GetNames is set up and used.


DECLARE FUNCTION GetNames%(Array$())
REDIM Array$(1 TO 1)            'use REDIM, not DIM
Array$(1) = "*.*"               'any valid spec is okay
NumFiles% = GetNames%(Array$()) 'load all names at once

IF NumFiles% = 0 THEN           'were any files found?
  PRINT "No matching files."    'no, say so and end
  END
END IF

FOR X% = 1 TO NumFiles%         'yes, print each name
  PRINT Array$(X%)
NEXT
PRINT NumFiles%; "matching files were found"


As you can see, you must establish the array initially using REDIM.  To
avoid the need for an extra parameter, the file specification is passed in
the first element of the array.  Furthermore, GetNames returns the number
of files that matched as an integer result.  If no files were encountered,
GetNames leaves the array as it was.
   When GetNames is called, the array may already contain other data, and
it can have any legal upper and lower bounds.  As long as the lowest
element number contains a valid search specification, the spec can be found
and the array will be redimensioned starting at element number one.  The
GETNAMES.BAS demonstration program on the accompanying disk adds to this
short example by sorting the names after they are read.
   A complete description of how GetNames works follows this source
listing.

;GETNAMES.ASM, loads a group of file names into an array

.Model Medium, Basic
  Extrn B$RDIM:Proc       ;this redimensions an array
  Extrn B$ASSN:Proc       ;this assigns a string
  Extrn B$FLEN:Proc       ;this returns a string's length

  DTAType Struc           ;define the DOS DTA structure
    Intern  DB 21 Dup (?) ;this is used by DOS internally
    FAttr   DB ?          ;this holds the file attribute
    FTime   DW ?          ;this holds the file time
    FDate   DW ?          ;this holds the file date
    FSize   DD ?          ;this holds the file size
    FName   DB 13 Dup (?) ;this holds each file name
  DTAType Ends

.Data
  DTA DB Size DTAType Dup (?) ;DOS places file info here
  NumFiles   DW 0             ;how many names were read
  SpecLength DW 0             ;remembers file spec length

.Code

GetNames Proc Uses SI DI, Array:Word

  Local Buffer[80]:Byte  ;copy the spec here, add a zero

;-- Create a local Disk Transfer Area for our own use.
  Lea  DX,DTA            ;show DOS where the new DTA goes
  Mov  AH,1Ah            ;set DTA service
  Int  21h               ;call DOS to do it

;-- Read the array descriptor, get the search spec from the first element,
;   then copy it to the stack appending a CHR$(0) byte (ASCIIZ string).
  Mov  SI,Array          ;get address of array descriptor
  Mov  BX,[SI+0Ah]       ;now BX holds adjusted offset
  Mov  AX,4              ;each element is four bytes long
  Mul  Word Ptr [SI+10h] ;multiply by first element number
  Add  BX,AX             ;BX holds first element's address

  Push DS                ;push source segment and address
  Push BX                ;  for call to B$ASSN later on
  Xor  AX,AX             ;tell B$ASSN source is descriptor
  Push AX                ;using a value of zero

  Push BX                ;pass descriptor addr to B$FLEN
  Call B$FLEN            ;this returns the length in AX
  Mov  SpecLength,AX     ;save length locally for a moment

  Lea  AX,Buffer         ;get the destination address
  Push SS                ;pass the segment to assign into
  Push AX                ;and then the address
  Push SpecLength        ;we're assigning a fixed length
  Call B$ASSN            ;copy the file spec to the stack

  Lea  BX,Buffer         ;retrieve start address of spec
  Mov  DX,BX             ;copy to DX where DOS expects it
  Add  BX,SpecLength     ;point just past end of string
  Mov  Byte Ptr [BX],0   ;and append trailing zero byte

;-- Count the number of names that match the search specification.
  Mov  AH,4Eh            ;specify Find First matching name
  Mov  CX,00100111b      ;this matches any type of file
  Xor  BX,BX             ;BX counts the number of names

CountNames:
  Int  21h               ;see if there's a matching name
  Jc   DoneCount         ;carry set means no more names
  Inc  BX                ;otherwise, we found another one
  Mov  AH,4Fh            ;find the next matching name
  Jmp  CountNames        ;continue until there are no more

DoneCount:
  Mov  NumFiles,BX       ;remember how many files we found
  Or   BX,BX             ;did we fail on the first name?
  Jz   Exit              ;yes, return a count of zero

;-- Now that we know how many file names there are, REDIM the string array.
  Mov  AX,1              ;specify an LBOUND of 1
  Push AX                ;pass that on to B$RDIM
  Push BX                ;and pass on the new UBOUND value
  Mov  AL,4              ;each descriptor takes four bytes
  Push AX                ;pass that on too

  Mov  BX,Array          ;get array descriptor again
  Mov  AX,[BX+08]        ;load the existing Features word
  Push AX                ;use that again for this call
  Push BX                ;show where array descriptor is
  Call B$RDIM            ;finally, redimension the array

;-- This is the main processing loop that reads and assigns each name
;   that is found.
  Mov  AH,4Eh            ;specify Find First matching name
  Lea  DX,Buffer         ;load address of file spec again
  Mov  BX,Array          ;get array descriptor address too
  Mov  BX,[BX+0Ah]       ;reload the adjusted offset value
  Add  BX,4              ;BX is first descriptor address

Do:
  Mov  CX,00100111b      ;specify any type of file again
  Int  21h               ;see if there's a matching name
  Jc   Exit              ;carry set means no more names
  Push BX                ;otherwise, save the address

;-- Search for the zero that marks the end of this name.
  Mov  DI,Offset DTA.FName
  Push DS                ;in anticipation of call below
  Push DI                ;DI too while the address is handy

  Push DS                ;ensure that ES=DS
  Pop  ES
  Mov  CL,13             ;search up to 13 characters
  Xor  AL,AL             ;search for the terminating zero
  Repne Scasb            ;do the search
  Mov  AL,CL             ;save the remainder in AL

  Mov  CL,13             ;calc number of chars to copy
  Sub  CL,AL             ;the answer is now in CX
  Dec  CX                ;don't include the zero byte
  Push CX                ;pass that on to B$ASSN

  Push DS                ;show where destination string is
  Push BX
  Xor  AX,AX             ;zero means B$ASSN is assigning
  Push AX                ;  to a conventional string
  Call B$ASSN            ;assign this element to the name

  Pop  BX                ;retrieve the descriptor address
  Add  BX,4              ;point to the next element
  Mov  AH,4Fh            ;specify Find Next matching name
  Jmp  Do                ;and keep on keepin' on

Exit:
  Mov  AX,NumFiles       ;assign the function output
  Ret                    ;return to BASIC

GetNames Endp
End

GetNames begins by declaring the three BASIC routines it will call as being
external.  Next the DTA structure is defined, to simplify access to the
file name address when it assigns each element in the string array.  The
only data items are the DTA itself, two working variables, and the local
stack buffer.  Since the incoming file specification needs to be converted
to an ASCIIZ string for DOS, GetNames copies that specification into Buffer
and then appends a CHR$(0) zero byte to the end.
   Once the DTA has been established, the next step is to read the file
specification passed in the first element, and copy it into local storage. 
B$FLEN is used to obtain the length of the string, so GetNames will know
how far into the buffer the zero byte will be placed.  The last preparatory
steps call B$ASSN telling it to copy from a conventional string (the array
element) to a fixed-length string (Buffer), and then store the zero byte.
   The actual body of the program is broken into two portions.  The first
simply calls DOS repeatedly to count the file names, to know how many
elements are needed.  The count is then saved in NumFiles; if none were
found GetNames exits without doing anything else.  Otherwise, the incoming
string array is redimensioned from 1 to the number of files.
   The second portion again reads each file name through DOS, but this time
the names are actually assigned to the array elements using B$ASSN.  This
time, however, B$ASSN assigns a conventional string from the fixed-length
string portion of the DTA.  Since the source is now a fixed-length string,
GetNames needs to know how long each name is.  The longest possible name
is 13 bytes long (eight for the name, a period, three for an extension, and
one more for the terminating zero byte).  Therefore, ES:DI is set to point
to the file name field within the DTA, AX is set to zero to search for the
zero byte, and CX is loaded with the number of characters to scan.
   Once the zero is found--and it always will be--the count that remains
in CX is subtracted from 13 to obtain the actual length of the current
name.  Because that calculation includes the unwanted CHR$(0), CX is
decremented by one.
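   As a worked example, consider a hypothetical file named README.TXT,
which holds ten characters followed by the zero byte:

   CX before Repne Scasb          13
   bytes examined (10 + zero)     11
   CX after the scan              13 - 11 = 2
   13 minus the remainder         13 - 2  = 11
   less one for the zero byte     11 - 1  = 10

The result, 10, is the length that is passed on to B$ASSN.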
   There is one small related trick that bears explaining.  Just before the
call to B$RDIM, AX is loaded with the number 1, to specify that as the
first element number.  This three-byte instruction sets AL to 1, and clears
AH to 0.  Three lines below that only AL is assigned, which is sufficient
because we know that AH is already zero.  Because the number being assigned
is one byte long, assigning AL requires only two bytes.
   Admittedly, the savings are small, but the effect on code readability is
minimal once you know about such tricks.  And a byte saved is always
welcome in assembly language programming.  The same trick is used when
setting CL to 13, where CH is known to be zero after assigning the file
attribute of 00100111b to all of CX.
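   The encodings below illustrate the savings.  The second value (4) is
only a stand-in chosen for this sketch, since any one-byte constant behaves
the same way:

   Mov  AX,1              ;B8 01 00  three bytes, AL = 1 and AH = 0
   Mov  AL,4              ;B0 04     two bytes, AX is now 4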


HANDLING INTERRUPTS
===================

The last programming technique I want to describe is writing an interrupt
handler you can attach to a BASIC program.  There are several applications
for this, such as tapping into the timer interrupt to display an on-screen
clock.  Instead of having to constantly print TIME$ during your INKEY$
input loops, such a routine would act as a sort of TSR, getting control at
each timer tick and displaying the time automatically.
   The example I will show here takes over the keyboard interrupt, and
disables the Ctrl-Alt-Del key sequence.  This lets you prevent rebooting
with its corresponding loss of data, should someone press those keys
inadvertently (or on purpose!).  NoReboot is called as follows:

   CALL NoReboot(BYVAL InstallFlag%)

If InstallFlag is non-zero, you are telling NoReboot to install itself and
take over the keyboard interrupt to prevent rebooting.  An argument of
zero instead unhooks the interrupt, and re-enables those keys.  Although
you could certainly modify NoReboot to use BASIC's B_ONEXIT service to
deinstall itself automatically, I have left that feature out on purpose in
the interest of clarity.  This also lets you activate NoReboot selectively
in your program, since there is no way to revoke a request to B_ONEXIT.

;NOREBOOT.ASM, traps Ctrl-Alt-Del within a BASIC program

.Model Medium, Basic
.Code

NoReboot Proc Uses DS, InstallFlag:Word

  Cmp  InstallFlag,0     ;are they asking to install?
  Je   Deinstall         ;no, so deinstall it

  Cmp  CS:Old9Seg,0      ;yes, are we already installed?
  Jne  Exit              ;yes, and don't do that again!

  Mov  AX,3509h          ;ask DOS for current Int 9 vector
  Int  21h               ;DOS returns it in ES:BX
  Mov  CS:Old9Adr,BX     ;save it locally
  Mov  CS:Old9Seg,ES

  Mov  AX,2509h          ;point Int 9 to our own handler
  Mov  DX,Offset NewInt9
  Push CS                ;copy CS into DS
  Pop  DS
  Int  21h

Exit:
  Ret                    ;return to BASIC


;-- Control comes here when a key is pressed or released.
NewInt9:
  Sti                    ;enable further interrupts
  Push AX                ;save the registers we're using
  Push DS

  In   AL,60h            ;read the keyboard scan code
  Cmp  AL,83             ;is it the Delete key?
  Jnz  Continue          ;no, continue on to the BIOS

  Xor  AX,AX             ;see if Alt and Ctrl are pressed
  Mov  DS,AX             ;by looking at address 0:417h

  Mov  AL,DS:[417h]      ;get shift status at 0000:0417h
  Test AL,8              ;is Alt key depressed?
  Jz   Continue          ;no, continue on to the BIOS
  Test AL,4              ;is Ctrl key depressed?
  Jz   Continue          ;no, continue on to the BIOS

  In   AL,61h            ;send an acknowledge to keyboard
  Mov  AH,AL             ;otherwise the Ctrl-Alt-Del
  Or   AL,80h            ;  keystroke will still be
  Out  61h,AL            ;  hanging around the next time
  Mov  AL,AH             ;  a program asks for a key
  Out  61h,AL
  Mov  AL,20h            ;indicate end of interrupt to the
  Out  20h,AL            ;  8259 interrupt controller chip

  Pop  DS                ;ignore, simply return to caller
  Pop  AX
  Iret                   ;use this special Ret when
                         ;  returning from an interrupt
Continue:
  Pop  DS                ;restore the saved registers
  Pop  AX
  Jmp  DWord Ptr CS:Old9Adr   ;continue on to the BIOS by
                              ;  jumping to the address
                              ;  that was saved during
                              ;  initialization
DeInstall:
  Cmp  CS:Old9Seg,0      ;were we ever installed?
  Je   Exit              ;no, there is nothing to restore
  Mov  AX,2509h          ;restore original Int 9 handler
  Mov  DX,CS:Old9Adr     ;from segment and address saved
  Mov  DS,CS:Old9Seg     ;  earlier
  Int  21h               ;DOS does this for us
  Mov  CS:Old9Seg,0      ;clear this as an installed flag
  Jmp  Short Exit        ;and then exit back to BASIC

NoReboot Endp

  Old9Adr   DW 0         ;remembers original Int 9 address
  Old9Seg   DW 0         ;these must be stored in the code
                         ; segment because DS is undefined
                         ; when NewInt9 receives control
End

The first thing NoReboot does is look to see if the caller is installing
or deinstalling.  If installation is requested, the saved Interrupt 9
segment is checked, to be sure that it holds the initial value of zero. 
It is important to prevent multiple installations, because installing saves
the current interrupt handler's address.  If NoReboot installed itself
twice, it would save its own address on top of the original BIOS handler's
saved address.  And once that address is lost, it is impossible to restore
it again later.
   Assuming it is safe to be installed, the next step is to ask DOS for the
current interrupt handler's address using service 35h.  This service
expects the service number in AH, and the interrupt number in AL.  To save
a byte, both values are loaded at once.  Service 35h returns the segment
and address in ES:BX, and these are saved in the code segment.  Because the
original address will be called from within the interrupt handler, CS is
the only register whose contents are known.  Accessing data in DGROUP is
more difficult, because an interrupt can occur at any time, and DS will
likely not be holding the correct segment.  [That is, execution could be
at any point in the program when Ctrl-Alt-Del is pressed, including within
a routine that has changed DS.  So when NoReboot receives control it can't
be certain that DS holds the segment for .Data variables it has defined.]
   Once the original interrupt handler address has been saved, NoReboot
calls DOS again, but this time to assign the segment and address of its
replacement handler in the interrupt vector table.  It is easy to access
the interrupt vector table directly using Mov instructions, but it is even
easier to have DOS do that.
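   For comparison, here is a rough sketch of the direct approach.  It
assumes the same NewInt9 label, and it brackets the two Mov instructions
with Cli and Sti so a key press cannot arrive while the vector is only half
written:

   Xor  AX,AX                    ;the vector table is at segment 0
   Mov  ES,AX
   Cli                           ;no interrupts while we change it
   Mov  Word Ptr ES:[9 * 4],Offset NewInt9   ;address in the low word
   Mov  Word Ptr ES:[9 * 4 + 2],CS           ;segment in the high word
   Sti                           ;allow interrupts again

Each vector occupies four bytes, so the entry for Interrupt 9 starts at
address 0000:0024h.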
   Finally, NoReboot returns to the calling BASIC program, and all
subsequent key presses are now routed to the NewInt9 procedure.
   NewInt9 must perform a few tricks, partly because it is handling a
hardware interrupt.  The 8088 disables interrupts automatically whenever it
enters a handler, so NewInt9 begins with the instruction Sti, which tells
the 8088 to allow further interrupts to occur and be processed.  Next, the
two registers being used are saved on the stack, so they can be restored
again later.  Because a keyboard interrupt can occur at any time,
interrupting the process that is currently running, it is imperative that
you not alter any aspect of the 8088's current state.  This includes the
settings of the Flags register as well.  However, the Flags register is
saved automatically by the 8088 as part of its handling of interrupts, so
the flags don't have to be saved or restored manually using Pushf and Popf.
   The next sequence of instructions reads the key that was pressed from
the keyboard's I/O port (60h), and compares that to the scan code for the
Del key.  If any other key was pressed, NoReboot jumps to the original
keyboard handler in the ROM BIOS.  Otherwise, it examines low memory to see
if both the Ctrl and Alt keys are also currently pressed.  Unless all three
conditions are met, control passes on to the BIOS.  But if Ctrl-Alt-Del is
pressed, NoReboot handles the keystroke entirely on its own and ignores it.
In that case it acknowledges the keystroke to the keyboard hardware, tells
the 8259 interrupt controller that the interrupt is finished, restores DS
and AX, and exits back to the underlying program.
   Notice the special form of return command, Iret (Interrupt Return). 
Like a conventional far return, Iret pops the address and segment to return
to from the stack, but it also pops the Flags register that was stored
there by the 8088 automatically.
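   The sequence can be pictured as follows.  This is only a sketch of what
the 8088 does on its own, written as if it were ordinary instructions:

   ;when the interrupt occurs            ;when Iret executes
   ;  Pushf                              ;  Pop IP
   ;  clear Interrupt and Trap flags     ;  Pop CS
   ;  Push CS                            ;  Popf
   ;  Push IP
   ;  jump through the vector

Because Iret restores the flags that were in effect before the interrupt,
the interrupted program also gets back its original Interrupt flag setting.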
   The final section of code restores the original interrupt vector, and
clears the Old9Seg variable to zero.  This lets NoReboot know that it is
not installed, in case you call it again later.
   This same technique can be applied to handle other interrupt services,
and I encourage you to experiment on your own.  You could, for example,
write a routine that takes over the communications interrupt, and displays
a flashing box in a corner of the screen whenever characters are received. 
Likewise, you could modify this routine to create an on-screen display of
the Caps Lock and Num Lock state.  Each time one of those keys is pressed
you would either print or clear a status message.
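   As a starting point, the lock states are kept in the same BIOS shift
status byte that NoReboot examines.  This fragment is only a sketch, and
the label CapsOff is a hypothetical place where your own code would clear
the message.  Num Lock is tested the same way using bit 5 (20h):

   Xor  AX,AX             ;address low memory through segment zero
   Mov  DS,AX
   Mov  AL,DS:[417h]      ;get the shift status at 0000:0417h
   Test AL,40h            ;bit 6 is set while Caps Lock is active
   Jz   CapsOff           ;not set, so go clear the message

Keep in mind that the BIOS updates this byte in its own Interrupt 9
handler, so a new state is not visible until the BIOS has processed the key.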


DEBUGGING WITH CODEVIEW
=======================

As useful as CodeView can be for a purely BASIC program, it is even more
necessary when writing in assembly language.  CodeView lets you step
through the code that BASIC generates to set up and call your subroutine,
and then step through the routine a line at a time.  Being able to watch
your program as it executes helps you to quickly zero in on any problems. 
Further, CodeView shows you the current CPU register contents, as well as
the value of memory locations about to be read from or written to.
   To debug an assembly language subroutine with CodeView, you must first
assemble it using the /Zi option switch:

   masm routine /zi;

Then you link the routine to your BASIC program using the /Co option.  Of
course, the BASIC program must also have been compiled using /Zi:

   bc program /o /zi;
   link program routine /co;

Finally, you start CodeView specifying the name of the BASIC program:

   cv program

Once the BASIC source code is showing on the screen you can step and trace
through it as described in Chapter 4.  As with BASIC subprograms and
functions, to step into an assembler routine you press F8 at the CALL
statement.  If the routine is designed as a function you instead press F8
at the line in which the function is referenced.
   Once CodeView has traced into the routine, you can press F3 to view the
source code only, the assembly code only, or both intermixed.  I usually
prefer to view only my original source, but that hides the data memory
addresses that MASM and LINK assigned.  Usually you will not need to know
those addresses, but there are times when this can be helpful.  For
example, when a program is not working correctly, the bug could be caused
by a different portion of the program overwriting the named variables.
   Besides the F3 key, you can also use F4 and F7, and these have the same
meaning as the same keys when used in the BASIC editor.  Indeed, debugging
an assembly language subroutine is quite similar to debugging a BASIC
program as far as which keys are used.


MASM 6.0 ENHANCEMENTS
=====================

All of the discussions in this chapter have focused on using MASM version
5.1.  However, Microsoft's more recent version 6.0 introduces a number of
significant changes and new features.  Perhaps the most useful new feature
in this release is the greatly improved documentation.  The manuals that
came with past versions of MASM were very dry, containing reams of facts
but no practical advice or guidance.  The new documentation includes both
facts and programming tips, and this addition is welcome indeed.
   If you already have existing assembly language source code, you may have
to change it to accommodate the new MASM 6.0 conventions.  In particular,
MASM's handling of data structures has changed substantially, and in many
cases code that used to work correctly no longer does.  However, you can
optionally use the /Zm command line switch, to tell MASM 6.0 to behave like
the earlier 5.1 version.
   A new MASM.EXE program launcher is also included to offer a similar
capability.  Where older versions of MASM were named MASM.EXE, the new
program is called ML.EXE.  The MASM.EXE that now comes with MASM 6.0 simply
passes the /Zm option on to ML, along with some other option switches that
are needed to tell ML to mimic the older assembler's behavior.


IMPROVED ASSEMBLY OPTIMIZATIONS

Before MASM 6.0, a conditional jump was limited to a distance no greater
than 128 bytes earlier or 127 bytes farther ahead in the code.  When there
was no way to restructure your code to accommodate this inherent 8088
limitation, you had to use a conditional jump around another unconditional
jump like this:

   ;if AX < 12 go to FarLabel
     Cmp  AX,12              ;compare AX to 12
     Jnl  NearLabel          ;jump if not less over far jump
     Jmp  FarLabel           ;perform the far jump
   NearLabel:
      .                      ;program continues
      .
      .                      ;this label is more than
   FarLabel:                 ;  127 bytes past Jnl

MASM 6.0 avoids this limitation and lets you use Jl to the far label
directly, although it really just replaces your use of Jl with code
equivalent to that shown above.
   Another, similar optimization affects unconditional jumps.  As I
mentioned earlier, each time MASM 5.1 encounters a label in your source
code, it remembers its address in the resultant object code.  Then if you
jump backwards to that label later, MASM knows if it can use the shorter
two-byte form of the Jmp instruction.  But a forward jump to a nearby label
requires you to explicitly state Jmp Short to obtain this code savings,
since MASM 5.1 does not yet know the target label's address.  Without
Short, MASM 5.1 reserves room for the three-byte form on a trial basis.  If
the target turns out to be within short range, MASM goes back and patches
the code to a short jump followed by a byte-wasting Nop (No Operation)
instruction.
   MASM 6.0 avoids this problem by processing your source file in multiple
passes.  That is, MASM reads your code and assembles what it can, using
three-byte jumps when the target address has not yet been encountered.
Then it processes that intermediate code again, modifying its earlier
output as appropriate.  If a three-byte jump can be replaced with the
two-byte version, MASM 6.0 rewrites the code, sliding subsequent
instructions back
a byte.  MASM 6.0 is called an *n-pass assembler*, because as many passes
as needed are performed until the code is as small as possible.
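   The difference shows up in the bytes each assembler emits for a forward
jump to a label that is within short range.  In this sketch the label Ahead
is hypothetical, and xx stands for the one-byte distance to it:

   Jmp  Short Ahead       ;EB xx     either version, two bytes
   Jmp  Ahead             ;EB xx 90  MASM 5.1, short jump plus a Nop
   Jmp  Ahead             ;EB xx     MASM 6.0, after its extra passes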


NEW SIMPLIFIED DIRECTIVES

Besides the improved optimizing, MASM 6.0 offers several features borrowed
from high-level languages.  These include .IF, .ELSE, and .ELSEIF; .WHILE
and .ENDW; and .REPEAT and .UNTIL.  Unfortunately, these new constructs are
modeled after the C language, and provide little if any clarification to
BASIC programmers.  For example, you can now write code such as this:

   .IF (AL < "0") || (AL > "9")

which is equivalent to this BASIC statement:

   IF AL < ASC("0") OR AL > ASC("9") THEN

Even worse, the MASM manual does not document each directive showing
precisely what it does to your code.
   As in C, BASIC's AND is replaced with a double ampersand (&&), testing
for equality uses a double equals sign (==), and NOT is replaced with an
exclamation point (!).  Therefore, you could write assembly language source
statements like these next two examples:

   .IF (AX != 14) && (BX < 10) ;IF AX <> 14 AND BX < 10 THEN
     Mov  AX,SomeVar           ;divide SomeVar by CX
     Cwd                       ;extend the sign of AX into DX
     Idiv CX                   ;signed divide, to match Cwd
     Mov  SomeVar,AX
   .ENDIF


   .REPEAT
     Mov  AH,1                 ;ask for a keyboard character
     Int  21h                  ;through DOS
   .UNTIL (AL == 13)           ;loop until they press Enter
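
Written by hand, the .REPEAT loop above would look something like the
following.  This is only an approximation; the label name is hypothetical,
and the labels MASM creates internally are different:

   Again:
     Mov  AH,1                 ;ask for a keyboard character
     Int  21h                  ;through DOS
     Cmp  AL,13                ;was it Enter?
     Jne  Again                ;no, keep looping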

PROTO and INVOKE are two other new simplified directives, and it's hard
for me to recommend using them for similar reasons.  PROTO mimics C's
function prototype capability, and lets you define a called procedure and
its arguments.  INVOKE then calls that routine passing the arguments you
give it.  To define a procedure called, say, MyProc, you would use PROTO
like this:

   MyProc PROTO Var1:Word, Var2:Word, Var3:DWord

Then to call MyProc you use INVOKE as follows:

   INVOKE MyProc, BX, 100, LongVar

Thus, PROTO and INVOKE are very similar to DECLARE SUB and CALL in BASIC. 
The problem is that you have no way to know what code MASM generates for
this command unless you create a sample program, assemble it, and examine
the result using CodeView.  In particular, how does the value 100 used here
get onto the stack?  As it turns out, assembling the preceding INVOKE
command results in the following code:

   Push BX
   Mov  AX,100
   Push AX
   Push Word Ptr [LongVar+2]
   Push Word Ptr [LongVar]

As you can see, even if AX is holding an important value, its contents are
destroyed when MASM assigns the value 100 prior to placing it on the stack. 
While I applaud Microsoft's attempts to make assembly language easier to
use, such behavior can and will introduce subtle bugs.  These bugs can be
even harder to track down than usual, because you did not make the coding
error, the assembler did!  Since the whole point of programming in assembly
language is to control fully what the CPU is doing, such hidden behavior
can have disastrous effects.
   One new feature that I do find useful, however, is the ability to
continue a line with a trailing comma.  Often, a single source statement
will extend into the comments column, spoiling the appearance of your
listing.  You can now avoid this by placing a comma in the middle of a
logical line, and then continuing the remainder of the statement on the
next line.
   Another very useful feature is MASM 6's ability to accept wild cards on
the command line.  For example, you can assemble all of the files in the
current directory using the command masm *.asm;.


TRICKS OF THE TRADE
===================

The final topic I want to present is a variety of assembly language
programming short cuts and other techniques I have developed over the
years.  In preceding sections you saw how Xor or Sub can be used to clear
a register, using less code than Mov.  And if you know that the high-byte
portion of a register or memory variable is already zero, you can save a
byte by assigning only the lower byte.  And to clear both AX and DX you can
use Xor with AX, and then Cwd to extend the zero into DX using only one
additional byte.  As you might imagine, there are many other ways to be
clever in assembly language.
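   As a reminder, here is the last of those tricks with the byte counts
shown.  It works because Cwd copies the sign bit of AX, which is zero, into
every bit of DX:

   Xor  AX,AX             ;two bytes, versus three for Mov AX,0
   Cwd                    ;one byte, versus two for Xor DX,DX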


MINIMIZE CODE TO ACCESS PARAMETERS

When parameters are accessed within an assembly language subroutine, the
usual way to get at them is through BP.  Even when you use MASM's
simplified directives, code to push BP, assign it from SP, and then
reference the address on the stack is added to your program.  In that case,
the steps are simply hidden from you.  Because BASIC (and indeed, every
high-level language) requires you to preserve BP, one byte each is needed
for the Push and Pop instructions.
   You can eliminate that overhead by taking advantage of the fact that the
stack is always kept in DGROUP, and that SS and DS are equal.  The trick
is to use BX as a stack reference, because it doesn't need to be preserved. 
Unfortunately, this precludes using the simplified methods for parameter
access.  But when speed or code size is paramount or you have many
routines, stack addressing via BX affords a real savings.  Here's how you
might design the routine, using an example that accesses an incoming string:

   GetString Proc       ;one parameter, not shown

     Mov  BX,SP         ;address the stack manually using BX
     Mov  BX,[BX+04]    ;get the address for the string
     Mov  CX,[BX]       ;get the length of the string
     Jcxz Exit          ;quit if the string is null
     Mov  BX,[BX+02]    ;get address of first character

   Exit:
     Retf 2             ;specify far return with 2 bytes

   GetString Endp
   End

Because BP has not been pushed onto the stack, the incoming string
descriptor address is at [BX+4] rather than [BX+6].  Other than that, the
remainder of the routine proceeds as usual.
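   For comparison, this sketch shows the same routine written with the
conventional BP method.  The four bytes for Push BP, Mov BP,SP, and Pop BP
take the place of the two bytes for Mov BX,SP in the version above:

   GetString Proc       ;one parameter, not shown

     Push BP            ;save BP for BASIC
     Mov  BP,SP         ;address the stack using BP
     Mov  BX,[BP+06]    ;get the address for the string
     Mov  CX,[BX]       ;get the length of the string
     Jcxz Exit          ;quit if the string is null
     Mov  BX,[BX+02]    ;get address of first character

   Exit:
     Pop  BP            ;restore BP
     Retf 2             ;specify far return with 2 bytes

   GetString Endp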


BYTE SAVERS

Another useful trick lets you save a byte when adding two to a register.
As you know, Inc and Dec when used with a register are always better than
Add and Sub, because they are one-byte instructions.  Therefore, two Inc
or Dec commands in a row are still better than Add AX,2 which requires
three bytes.  However, you must never do this with SP.  The stack pointer
must always hold an even number, and it is possible that an interrupt could
come along after the first Inc or Dec, but before the second has executed. 
Which brings up a related byte saver.
   If you need only a single word of local stack storage, don't use Sub
SP,2 to allocate the space and Add SP,2 later to clear it.  Instead, simply
use Push AX, or Push with any other register.  Likewise, just before
returning to BASIC, pop any register that doesn't return information, such
as CX or BX.
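   Here is a sketch of that idea in a routine that addresses its local
word through BP.  Push AX and Pop CX are one byte each, where Sub SP,2 and
Add SP,2 are three bytes each:

   Push BP
   Mov  BP,SP
   Push AX                ;reserve one word, now at [BP-2]
   Mov  Word Ptr [BP-2],0 ;use the local word as needed
    .
    .
   Pop  CX                ;discard the local word
   Pop  BP
   Retf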


Rep Always Clears CX

Another trick you can take advantage of is that CX is always zero after a
repeating string command that uses the plain Rep prefix.  Zero is a common
value in assembly language programming, and you can usually save a byte by
using a register instead of a constant zero.  In particular, if you are
copying a file name to a buffer and adding a CHR$(0) to the end, you can use
code like this:
    .
    .                   ;set up DS:SI and ES:DI here
   Mov  CX,NumBytes
   Rep  Movsb
   Mov  [DI],CL         ;tack a zero byte onto the end
                        ;  (assumes ES = DS, else use ES:[DI])
    .
    .

This trick is made even more valuable by the fact that DI is left pointing
at the byte just past the data that was just copied.  Of course, CX is not
necessarily zero after Repe or Repne, because those forms of Rep can
terminate before CX is exhausted.


Use AX Where Possible

Another little-known fact is that memory operations that use AX are one
byte smaller than equivalent operations on any other register.  That is,
Mov BX,KeyCode results in four bytes of code, whereas Mov AX,KeyCode
creates only three.  I often use the DOS DEBUG program for quick tests,
just to see which sequence of instructions results in less code.  Since
DEBUG does not let you specify a variable name, use [100] or any other
address instead:

   -a 100
    ####:0100 Mov AX,[100]
    ####:0103 Mov BX,[100]
    ####:0107 <press Enter to stop assembling>
   -u 100,106
    ####:0100 A10001      MOV   AX,[0100]
    ####:0103 8B1E0001    MOV   BX,[0100]
   -q

This sample session tells DEBUG to begin assembling at address 100 (the
default for .COM files), and then assemble the two instructions shown.
When you are done press Enter on the empty line that follows, and then
unassemble the results and quit.  As you can see, using AX creates one less
byte of code.


Multiplying and Dividing By a Power of 2

Because of the way binary numbers are organized, shifting the bits left
or right can provide a very fast way to multiply or divide by a power of
two.  And because the bit shifting commands can be used with all but the
segment registers, this can also save you from having to copy the data to
AX or DX:AX first.  To divide a register by two simply shift the bits right
one position:

   Shr CX,1

And to multiply by two shift them left:

   Shl SI,1

If you need to multiply or divide by four, eight, sixteen, and so forth,
the shift count must first be placed into the CL register:

   Mov CL,5       ;prepare to divide BP by 32
   Shr BP,CL

On 80186 and later processors you can specify a shift count directly. 
Unfortunately, this doesn't work with an 8088, so CL must be used.  Still,
multiplying and dividing are extremely slow instructions on an 8088, so the
added setup will be more than offset if speed is the primary factor.
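   When the count is small you can also simply repeat the one-bit form,
which leaves CL alone.  This sketch multiplies AX by four:

   Shl  AX,1              ;AX is now doubled
   Shl  AX,1              ;and doubled again

According to Intel's published 8088 timings, each one-bit register shift
takes 2 clock cycles, while the CL form takes 8 cycles plus 4 more for each
bit shifted.  Note too that Shr treats the value as unsigned; Sar is the
form that preserves the sign of a signed number.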


Low Memory is at Segment Zero

Another useful byte saver is to treat the BIOS data area in low memory as
being at segment zero, instead of the more commonly used segment 40h.  By
convention, the BIOS data area is said to reside at segment 40h, even
though a number of segment/address pairs can be used to access that data. 
I mentioned this briefly in Chapter 11, in the discussions about using
BASIC's CALL Interrupt.  Since Xor or Sub can be used to clear a register
to zero with one byte less code than assigning it a value of 40h, I use this
technique frequently:

   This example generates 9 bytes:
        Xor  AX,AX
        Mov  DS,AX
        Test Byte Ptr [417h],8  ;see if the Alt key is depressed

   And this example creates 10 bytes:
        Mov  AX,40h
        Mov  DS,AX
        Test Byte Ptr [17h],8


Scanning An ASCIIZ String

Because ASCIIZ strings are used in programs that access DOS services,
searching those strings to find the end is a common operation.  For
example, the GetNames function does this to determine the length of each
file name before assigning it to elements in the incoming string array. 
In that routine CX is assigned to 13, which is the maximum length a file
name can be.  Since CX is decremented for each character that is examined,
the length is calculated by subtracting CX from 13, which requires an extra
register.
   As long as you are certain that a zero byte is present, you can use a
clever trick to determine directly the number of bytes that were searched. 
Instead of loading CX with the maximum number of bytes to scan, assign it
to -1.  As each character is searched CX is decremented, which results in
a negative version of the number of bytes.  Then the NOT instruction can
be used to revert that to a positive number:

   Mov  ES,StrSeg      ;point ES:DI to the start of the data
   Mov  DI,StrAdr      ;  (StrSeg and StrAdr are placeholders)
   Cld                 ;ensure that scanning is forward
   Mov  CX,-1          ;set CX to -1
   Mov  AL,0           ;search for a zero byte

   Repne Scasb         ;scan the string
   Not  CX             ;convert to a positive number
   Dec  CX             ;don't include the zero byte itself
   Mov  AX,CX          ;now AX holds the length of the string

As you learned in Chapter 2, BASIC's NOT instruction flips all of the bits,
converting ones to zeros and vice versa.  The assembly language version
works the same way, and can be used with registers or memory locations.
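   To see why this works, follow CX for a hypothetical three-character
string such as "ABC" followed by its zero byte:

   CX before the scan             FFFFh  (-1)
   after examining four bytes     FFFBh  (-5)
   after Not CX                   0004h  (the bytes examined)
   after Dec CX                   0003h  (the length)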


CYCLE SAVERS

Besides saving bytes when possible, most assembly language programmers
also like to save clock cycles.  Every assembler instruction requires a
certain number of CPU clock cycles to execute, although there are other
factors that also affect the actual throughput of a given piece of code.
But as a rule, instructions with fewer clock cycles as published by Intel
are faster than those that require more cycles.


Move and Store Words Instead of Bytes

One very effective speed enhancement is to copy and store words when
possible, instead of bytes.  On 80286 and later processors, words are moved
and stored as quickly as bytes.  Therefore, moving 50 words is much faster
than moving 100 bytes.  If you know ahead of time how many bytes are going
to be processed and that the number is even, you can simply load CX with
half the value, and use Rep Movsw or Rep Stosw instead of Rep Movsb or Rep
Stosb.  [This trick can be used even if the program runs on an 8088, but
the 8088's 8-bit data bus makes the gain much smaller there than it is with
80286 and later CPUs.]  With only a little added code you can also use this
technique to determine at runtime if an odd byte needs to be processed.
Here's one way to do that:

   Shr  CX,1     ;divide CX by 2
   Rep  Movsw    ;copy the words
   Jnc  Done     ;done if the Carry Flag is clear
   Movsb         ;copy the odd byte
   Done:
    .            ;program continues
    .

First, CX is divided by 2, and the odd bit, if there was one, is stored
by the CPU in the Carry Flag.  Then the data words are copied to their
destination.  Finally, the Carry flag is tested and the program either
copies a single additional byte or skips over that command.
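   A variation on this idea avoids the jump altogether.  It relies on two
facts: Rep Movsw always leaves CX at zero, and it does not disturb the
Carry flag.  Treat this as a sketch to test rather than a drop-in
replacement:

   Shr  CX,1     ;divide CX by 2, the odd bit goes to Carry
   Rep  Movsw    ;copy the words, CX ends up as zero
   Adc  CX,CX    ;CX = 0 + 0 + Carry, so either 0 or 1
   Rep  Movsb    ;copy the odd byte, if there is one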


A Jump Not Taken is Faster Than One That is

And this brings us to yet another cycle saver.  In some cases the Jnc will
be executed, and in others it will not.  And in most programs, the chances
of either happening are about fifty-fifty.  But if you know ahead of time
that a particular action will happen less often than another, you can take
advantage of another 8088 fact: A jump not taken is always faster than one
that is taken.
   Each time the 8088 jumps to a new location or calls a procedure, it
discards its *pre-fetch queue*.  The pre-fetch queue is a small area of
memory on the CPU itself that holds the next few instructions to be
executed.  In many cases, the 8088 can do several things at once.  So while
it is adding or subtracting numbers, it simultaneously fetches instruction
bytes from your code, in anticipation of what it will do next.  This lets
the CPU act on the subsequent instructions very quickly, because they are
already in its own local on-chip memory.  Just as data in registers can be
accessed faster than data that must be read from memory, so too can
instructions that are already in the CPU.
   But when execution branches to a new location, any bytes present in the
pre-fetch queue are obsolete.  Therefore, the 8088 must read the new bytes
at the new location, which takes additional time.  If you have a routine
that makes a test repeatedly within a loop you should change the logic as
necessary, to branch on the less likely situation.  That is, instead of Jne
you might use Je, or vice versa.
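   For example, suppose a loop examines characters and blanks are rare.
In this sketch the labels are hypothetical; the point is that the common
case falls through the conditional jump instead of taking it:

   Next:
     Lodsb                ;get the next character
     Cmp  AL," "          ;is it a blank?
     Je   IsBlank         ;rarely taken, so usually the fast case
      .                   ;handle the common, non-blank case here
      .
     Loop Next            ;go on to the next character

Intel's published 8088 timings list 16 clock cycles for a conditional jump
that is taken, but only 4 when it is not.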


MISCELLANEOUS TECHNIQUES

One very powerful technique you will surely find useful is self-modifying
code.  As its name implies, self-modifying code actually writes new
instructions into its own code segment, and this is useful in a variety
of situations.  For example, if you are writing a routine that accepts a
variable number of parameters, this lets you patch the Ret instruction to
be Ret 2, Ret 4, and so forth.
   One warning, however, is related to the pre-fetch queue.  If a byte or
word has already been read into the CPU, changing it in the code segment
has no effect.  Worse, there is no way to know for certain which bytes will
have already been read, because the size of the pre-fetch queue has grown
with each new CPU from Intel.  For example, only four bytes are allocated
for a pre-fetch queue on an 8088, but the 80386 uses 16 bytes.
   In general, if the code you are patching is located at least a few dozen
bytes farther along in the program, you should be safe.  Such self-modifying
code
was used in the SORT.ASM routine shown in Chapter 8, to let the same code
sort either forward or backward.  There, the bytes that represent Jae and
Jbe were assigned to AL and AH, and the code was patched based on the
incoming sort direction.  Since the patching takes place a hundred or so
bytes earlier in the program, it is unlikely that this routine will fail
with future processors.
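   The fragment below is a minimal sketch of this kind of patch; it is not
the actual SORT.ASM code.  Direction, Patch_Here, and the other labels are
hypothetical names, and the element comparison is only a placeholder.  The
values 73h and 76h are the opcode bytes for the short forms of Jae and Jbe
respectively.  As an added precaution the sketch performs a short jump
right after the patch because, as explained earlier, any jump that is taken
discards the pre-fetch queue.

   Mov  AL,73h         ;assume ascending: 73h is the opcode for Jae
   Cmp  Word Ptr Direction,0  ;Direction is a hypothetical parameter
   Je   Patch          ;zero means ascending, so use Jae
   Mov  AL,76h         ;otherwise use 76h, the opcode for Jbe
Patch:
   Mov  Byte Ptr CS:Patch_Here,AL  ;write the opcode into the code
   Jmp  Short Flush    ;a taken jump discards the pre-fetch queue
Flush:
    .
    .
   Cmp  AX,[DI]        ;compare two elements (placeholder)
Patch_Here:
   Jae  No_Swap        ;this opcode byte is replaced by the patch
    .                  ;code to swap the elements goes here
    .
No_Swap:

Because the short forms of Jae and Jbe are both two bytes long, replacing
only the opcode byte leaves the jump displacement that follows it intact.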


Static-Free CGA Text Display

The final technique you will find useful is writing to CGA text mode video
memory without creating a disturbance.  When IBM designed the original CGA
adapter they skimped on the design, using circuitry that shares a single
address line for both the 8088 CPU and the video hardware that updates the
screen.  Even when a program is not reading from or writing to display
memory, that memory is still read periodically by the display adapter and
sent to the monitor.  Therefore, accessing that memory directly from an
assembly language routine creates a disturbing burst of static that is
visible on the monitor.  This is caused by the conflict of the CPU and the
video adapter accessing the same video memory addresses at the same time.
   Newer CGA adapters employ a dual-port design that arbitrates
simultaneous read and write requests, thereby eliminating this problem. 
And, of course, EGA and VGA adapters are much more sophisticated than the
CGA, and fortunately also more common these days.  However, you can avoid
the screen disturbance on older CGA adapters by synchronizing your reading
and writing with the horizontal retrace timing.
   As you undoubtedly know, the image on a CRT is drawn by scanning a
single dot horizontally across each successive row.  This happens so
quickly that the eye perceives the moving dot as an entire image.  After
each row is drawn, the dot is turned off, quickly placed at the start of
the next row below, and then turned on again.  By writing to the screen
only while the dot is turned off you can hide the memory conflicts that
cause static.
   The short code fragment below shows how to synchronize video writing
with the CGA's horizontal retrace.  In a windowing routine that also needs
to read video memory, you would use the same technique just before each
byte or word is read.

    .
    .
   Mov  SI,Descriptor  ;get the incoming descriptor address
   Mov  CX,[SI]        ;the string's length goes in CX
   Mov  SI,[SI+2]      ;and the address of the data in SI

   Mov  AX,0B800h      ;load ES with the CGA video segment
   Mov  ES,AX          ;through AX
   Xor  DI,DI          ;point DI to the upper left corner
   Mov  DX,3DAh        ;point DX to the CGA status port

   Mov  AH,Color       ;load color parameter (passed BYVAL)
   Jcxz Done           ;don't try to print a null string!

No_Retrace:
   In   AL,DX          ;get the video status byte
   Test AL,1           ;test the horizontal retrace bit
   Jnz  No_Retrace     ;if doing retrace, wait until done
   Cli                 ;disable interrupts until we're done

Retrace:
   In   AL,DX          ;get the status byte again
   Test AL,1           ;are we currently doing a retrace?
   Jz   Retrace        ;no, wait until we are
   Lodsb               ;load the current character
   Stosw               ;store the character and attribute

   Sti                 ;re-enable interrupts
   Loop No_Retrace     ;loop until the string is printed

Done:
    .                  ;program continues or exits here
    .

The current horizontal retrace status is obtained by reading the CGA status
port with the In instruction, and then testing only the lowest bit.  To
protect against the case where the print loop is entered just as the retrace
is about to end, this routine first waits for any retrace in progress to
finish, and then waits until a new one has just begun.  This is not unlike
the empty loop used in the benchmark examples in Chapter 9, which waited for
a new system clock tick to begin.
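   As a rough sketch of the same idea applied to reading, the fragment
below waits for a new retrace period and then fetches one character and
attribute pair from the screen.  It assumes that ES:DI already points into
video memory and that DX holds 3DAh, the address of the CGA status port;
the label names and the choice of BX are arbitrary.  The word is read into
BX rather than AX because AL is used for polling the status port.

Wait_Off:
   In   AL,DX          ;get the video status byte
   Test AL,1           ;are we in the middle of a retrace?
   Jnz  Wait_Off       ;yes, wait until it ends
   Cli                 ;disable interrupts until we're done

Wait_On:
   In   AL,DX          ;get the status byte again
   Test AL,1           ;has a new retrace begun?
   Jz   Wait_On        ;no, keep waiting
   Mov  BX,ES:[DI]     ;yes, read the character and attribute
   Sti                 ;re-enable interrupts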


SUMMARY
=======

In this final chapter you have learned what assembly language programming
is all about, and how it can help you as a BASIC programmer.  There is no
doubt that using assembly language is more tedious than BASIC, but the
overall methods and code structures are similar.
   You learned about the 8088's registers, and why operations that use them
are faster than similar operations on memory variables.  The string
instructions are particularly useful, because they are very small and do
several things at once.  Coupled with the Rep prefix these commands can
replace many separate Mov and Inc and Cmp statements.  You also learned how
to perform simple calculations in assembly language, and an example showed
how to translate simple BASIC integer and floating point expressions.
   This chapter explained how the stack operates, and how procedures are
designed to accept passed parameters.  The new simplified directives
introduced with MASM 5.1 eliminate the need to define segments and figure
parameter stack displacements in your routines.  This chapter also
explained how to call DOS and BIOS interrupts from assembly language.
   You learned how to access every kind of data a BASIC program can pass
to a routine, including near and far strings, integers, and even floating
point values.  The section that described arrays showed how to access both
near and far data, and even huge arrays that span multiple segments.
   Besides conventional called procedures, you also learned how to create
functions that can return any type of data.  Several innovative techniques
were presented, including a method for creating a single procedure that can
work with both near and far strings, and even with different versions of
the BASIC compiler.  Equally innovative are the methods that show how to
write floating point instructions and tie them into BASIC's software
emulator.  And if you are not certain how to code a particular floating
point instruction, you can create a short BASIC program and then examine
its code using CodeView.
   This chapter explained many of MASM's features, such as initialized
data, conditional assembly, and defining structures and macros.  In
particular, macros can greatly simplify coding redundant instructions and
data definitions.  Furthermore, MASM can calculate data addresses and
lengths automatically, reducing your work when the data must be changed
later on.
   Because so many different data items all compete for the same 64K near
memory segment, it is often desirable to store working variables on the
system stack.  Likewise, when large amounts of data are involved, variables
and tables can be stored in the code segment.  Both of these techniques
were described in depth, and accompanying examples showed how to do this
in context.
   Several of BASIC's most useful internal variables and procedures were
described, showing their public names and parameter requirements.  The
GetNames function brought all of this information together, showing how to
read an array descriptor, redimension a string array, and assign individual
elements--all using code that works identically with both near and far
strings.
   You also learned how to write an interrupt handler that can be installed
and deinstalled from within a BASIC program.  The example showed how to
take over the keyboard interrupt; however, the same technique can be
applied to nearly any other hardware or software interrupt as well.
   Finally, this chapter described many useful tricks and techniques that
help to reduce the size of your assembly language routines, and also make
them faster.  Many operations that use the AX register result in less code
than the same operations using other registers.  And when moving or storing
contiguous data, accessing the data as words instead of bytes can sometimes
yield a nearly two-fold speed improvement.  When in doubt about which of
several sequences of code is smaller, you can use the DOS DEBUG utility to
quickly determine that.
